Closed pabloarosado closed 2 years ago
Thanks @bnjmacdonald, I was trying to follow the philosophy of the energy-data repos. But I agree with you that it would be better to use `importers` or `etl` to upload the CAIT datasets to Grapher. I could either do it now and cancel this PR, or leave it for a future improvement. I don't have a strong opinion on which option is better, @edomt?
Indeed, the general pipeline isn't optimal, but of course it was already like that before. There are several things to take into account, including that:
- `importers` will be deprecated in the coming months
- `etl` isn't fully ready yet, so even an implementation of this pipeline in `etl` would need to be revisited at some point

What I would suggest to get rid of the confusion mentioned by @bnjmacdonald is that:
- we move `prepare_cait_datasets.py` & its output into a `cait` folder in `importers` in its current state, i.e. without refactoring it as a "true" importer that upserts data into Grapher (because that part will be deprecated soon). So something that looks like the population folder.
- `main.py` in this co2 repo then fetches that Grapher dataset to use it

And at the next update of CAIT (presumably next year?), we'll transform the whole thing into a proper `etl` pipeline.
Let me know what you think :)