2DegreesInvesting / tiltToyData

Toy datasets for tilt
https://2degreesinvesting.github.io/tiltToyData/
GNU General Public License v3.0

Features of the perfect toy datasets #24

Closed: maurolepore closed this issue 5 months ago

maurolepore commented 8 months ago

We recently improved the toy datasets for emissions profile (#19). However, there still seem to be some details to improve (https://github.com/2DegreesInvesting/tiltToyData/pull/19#issuecomment-1889276566).

@AnneSchoenauer and @Tilmon, please share your list of required features (for inspiration, see the "Features" list in this item of the changelog). I'll add your list here as a check-box tasklist and create separate issues to address each request.

From conversations in 2DegreesInvesting/tiltToyData#19 I see that we spent a lot of effort trying to ensure the privacy of licensed data. That effort is only necessary if we base our toy datasets on real data. While realism might be valuable, it seems important to weigh whether it's worth the risk of exposing private data, and worth the kind of effort we put into 2DegreesInvesting/tiltToyData#19. If we can indeed sacrifice realism in some sensitive columns, then we may simply populate them with totally fake values.
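As a rough illustration of the "totally fake values" idea, here is a minimal Python sketch (the package itself is written in R, so this only shows the logic). The column names, the `fake_rows` helper, and the seeding approach are assumptions made for the example; they are not taken from the actual toy datasets.

```python
import random
import uuid

# Hypothetical sensitive columns, for illustration only; the real toy
# datasets may use different ones.
SENSITIVE_COLUMNS = ["activity_uuid_product_uuid", "ei_activity_name"]


def fake_rows(n, seed=0):
    """Return n rows whose sensitive columns hold invented values."""
    rng = random.Random(seed)  # seeded so the toy data is reproducible
    rows = []
    for i in range(n):
        rows.append(
            {
                # Random 128-bit value formatted like a UUID; it is not
                # derived from any real (licensed) identifier.
                "activity_uuid_product_uuid": str(
                    uuid.UUID(int=rng.getrandbits(128))
                ),
                # Obviously fake label, so nobody mistakes it for real data.
                "ei_activity_name": f"fake activity {i + 1}",
            }
        )
    return rows
```

Because nothing in these rows comes from real data, no privacy check would be needed for the columns generated this way; the trade-off is that the values carry no realism at all.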

toy_emissions_profile*

Datasets:

Features:

toy_sector_profile*

AnneSchoenauer commented 8 months ago

@Tilmon, would it be fine for you to take this, since you know the licensing issues better? Let me know if I should give it a try, though! :)

Tilmon commented 8 months ago

Hi @maurolepore , thanks for this.

I see the following 2 features that are not covered yet, with regard to the two upstream indicators:

Is that clear enough / does that make sense?

Thanks!

maurolepore commented 8 months ago

input_activity_uuid_product_uuid can be reproduced from these columns -- which therefore can't be shared together.

Tilmon commented 8 months ago

@maurolepore please correct to

activity_uuid_product_uuid can be reproduced from these columns -- which therefore can't be shared together.

ei_activity_name, ei_reference_name, main_activity, geography

and

input_activity_uuid_product_uuid can be reproduced from these columns -- which therefore can't be shared together.

ei_input_activity_name, ei_input_reference_name, input_main_activity, input_geography

cc @kalashsinghal

kalashsinghal commented 8 months ago

@maurolepore Renaming the columns:

For activity_uuid_product_uuid:

ei_activity_name, reference_product_name, main_activity, product_geography

and

For input_activity_uuid_product_uuid:

input_ei_activity_name, input_reference_product_name, main_activity, input_geography

cc @Tilmon
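The "can't be shared together" constraint could be checked automatically. Below is a minimal Python sketch (the package itself is in R), assuming the constraint means that a dataset must not contain all four columns of a group at once. The column names are the renamed ones from the comment above; `REPRODUCING_GROUPS` and `exposed_keys` are hypothetical names made up for this example.

```python
# Column groups as listed in the comment above. Each group can reproduce the
# key column, so (under the assumption stated in the lead-in) a toy dataset
# should never contain a whole group.
REPRODUCING_GROUPS = {
    "activity_uuid_product_uuid": {
        "ei_activity_name",
        "reference_product_name",
        "main_activity",
        "product_geography",
    },
    "input_activity_uuid_product_uuid": {
        "input_ei_activity_name",
        "input_reference_product_name",
        "main_activity",
        "input_geography",
    },
}


def exposed_keys(columns):
    """Return the key columns that could be reproduced from `columns`."""
    present = set(columns)
    # A key is exposed only when every column of its group is present.
    return sorted(
        key for key, group in REPRODUCING_GROUPS.items() if group <= present
    )
```

For example, `exposed_keys(["ei_activity_name", "main_activity"])` returns an empty list, because only part of a group is present.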

Tilmon commented 8 months ago

Hi @kalashsinghal, I just checked tiltIndicatorAfter for profile_emissions_upstream, and the variable is called matched_reference_product.

Tilmon commented 8 months ago

*That's the output name; I'm not sure what it's called in your input data. But hopefully that helps to identify the right column?

kalashsinghal commented 8 months ago

@Tilmon My bad. It's called reference_product_name. I have updated my comment here: https://github.com/2DegreesInvesting/tiltToyData/issues/24#issuecomment-1895546670

Tilmon commented 5 months ago

Hi @maurolepore, I think we can close this issue now, right? We decided that we will use the toyData the DT develops for tiltIndicatorBefore. The idea would be to then use the output of tiltIndicatorBefore, based on the toyData, for the tiltIndicator package etc.