owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License
86 stars 23 forks source link

wizard: collect bugs / improvements #1563

Open lucasrodes opened 1 year ago

lucasrodes commented 1 year ago

After #1539 you can now use the new tool wizard to generate templates for your ETL steps.

Use this issue to report bugs that you may encounter. If the issue is very complex, feel free to create a separate - and more extense - issue.

pabloarosado commented 1 year ago

I'm using wizard now for the first time, so I'll write here all issues I find along the way:

lucasrodes commented 1 year ago

These issues should be solved in https://github.com/owid/etl/pull/1567

lucasrodes commented 1 year ago

@Marigold @pabloarosado From our discussions and the documentation on Notion, I had assumed that we agreed that the field license should be under $.meta.origin.

However, from what Pablo is saying, it sounds like this is raising an error. As a solution for now, I've set wizard to export the field license both at $.meta.origin.license and $.meta.license levels.

Do you think this is fine?

Marigold commented 1 year ago

It took me a while to realise that the produced .dvc file had license indented too many spaces, hence becoming an item of of origin. I think they are meant to be separate fields in the snapshot metadata.

The original idea was to keep it under origin, not separate from it. license is there only for backward compatibility. Ideally, snapshot meta would have either source & license or origin & origin.license (we could refactor old files if this causes too much confusion).

lucasrodes commented 1 year ago

Thanks for clarifying, Mojmir. Yes, that's what I had in mind.

@pabloarosado could you share code to reproduce the error?

lucasrodes commented 1 year ago

walkthrough bug that might be occurring here too: https://github.com/owid/etl/issues/1571

paarriagadap commented 1 year ago
paarriagadap commented 1 year ago

Two related comments to the ones above:

In general it would be good to match that file with the final version of the guidelines.

lucasrodes commented 1 year ago

Hey! Should I just reopen for a small issue then?

  • [x] The create playground notebook option in wizard doesn't generate a Jupyter Notebook

(copied from https://github.com/owid/etl/issues/1566#issuecomment-1723433161)

spoonerf commented 1 year ago

It's not super clear to me whether I should be hitting return after filling in each section or not? It seems like a good thing to somewhat validate each entry, but also the form completed unexpectedly early when pressing it, meaning I skipped the last couple of sections.

paarriagadap commented 1 year ago
# NOTE: To learn more about the fields, hover over their names.

# Learn more about the available fields:
# http://localhost:8000/architecture/metadata/reference/dataset/
dataset:
  update_period_days: 365

# Learn more about the available fields:
# http://localhost:8000/architecture/metadata/reference/tables/
paarriagadap commented 1 year ago

Harmonize country names with the following command (assuming country field is called country). Check out a short demo of the tool

lucasrodes commented 1 year ago

@spoonerf, good point, but unfortunately, clicking on ENTER is equivalent to submitting the entire form. It will be submitted if the form is valid (i.e., all required fields are present).

lucasrodes commented 1 year ago

@pabloarosado

I think the options to add regions and population data available on walkthrough should also be available on wizard as well.

Which options do you refer to here? I just executed walkthrough and there aren't - at least on my end - any options to add regions or population datasets.

paarriagadap commented 1 year ago

@lucasrodes Maybe it was removed recently, but they were available as check boxes at the # end of walkthrough garden

lucasrodes commented 1 year ago
paarriagadap commented 1 year ago
paarriagadap commented 1 year ago

Found here https://github.com/owid/etl/pull/1825#discussion_r1367034367

paarriagadap commented 1 year ago
paarriagadap commented 1 year ago

UPDATE: the current recommended workflow is based on chart-diff. So won't be addressing this issues.


I saved this draft chart to replicate (indicator here).

lucasrodes commented 12 months ago
paarriagadap commented 12 months ago
image
paarriagadap commented 10 months ago

EDIT: Also with the migrations

lucasrodes commented 9 months ago
paarriagadap commented 9 months ago

Probably there's not much to do, but

I see that the csv here includes .. for some data points.

lucasrodes commented 9 months ago

@paarriagadap, thanks for reporting. This issue is, as you mention, because there are non-numeric values in the dataset. Currently, the explore mode only supports comparing numerical values. We could compare categorical ones in the future if we needed to.

However, in this case, I think the indicators should be numbers and .. should be NaN instead. That way, the indicator would be recognised as numeric. Also, why do we have .. there? I bet that's a legacy dataset, and we won't be fixing this (removing ..). Unfortunately, implementing some logic to replace .. so that we can compare the data is too specific and don't think we should do it.

paarriagadap commented 9 months ago
lucasrodes commented 8 months ago

Chart revision:

pabloarosado commented 8 months ago

I see that some classic issues with the chart revision tool still happen occasionally:

It's not urgent, and I have already manually fixed the affected cases. I just mentioned them here so we are aware that these minor things still happen.

Marigold commented 8 months ago

I'll be looking at chart revisions this week, so I will take a look if there's an easy fix for that. (Examples like that are super useful by the way)

lucasrodes commented 8 months ago

Issue raised by pablo might be related to this one: https://github.com/owid/etl/issues/867 (meant to post this here)

cc. @Marigold

paarriagadap commented 7 months ago

Is the sorting in bar charts in the approval tool fixed here? Because I still find the issue:

image
paarriagadap commented 7 months ago
image image
paarriagadap commented 7 months ago
paarriagadap commented 6 months ago
lucasrodes commented 6 months ago

When using wizard snapshots, if I save the title with : it generates an error in the yaml file. It can be solved if the content is rendered in quotes.

Thanks for reporting @paarriagadap. I think this by design, otherwise the linter gets confused with the keyword. As you say, you can solve this by putting the text in quotes (") or using a multiline string (|-, >, etc.)

paarriagadap commented 6 months ago
Marigold commented 6 months ago

@paarriagadap fixed the expert, thanks for reporting!

paarriagadap commented 6 months ago
paarriagadap commented 6 months ago
paarriagadap commented 6 months ago
stale[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

paarriagadap commented 6 days ago

Indicator Upgrader didn't recognize the version of a dataset updated this year: PR Image

lucasrodes commented 6 days ago

Indicator Upgrader didn't recognize the version of a dataset updated this year: https://github.com/owid/etl/pull/3569

@pabloarosado, in case you want/have time to look into the detection algorithm

pabloarosado commented 5 days ago

Hi @paarriagadap thanks for reporting this, I've fixed it in your branch. There were two issues:

paarriagadap commented 5 days ago

Hi @pabloarosado. Ah, thanks for the fix. I should have forgotten about adding the dag file in the archive main.yml

Regarding those latest steps, yes, I think that when I use the tool to update steps, these latest steps are moved to the archive and disappear from the "live" yaml. At least I have seen that I need to re-add those steps to the latter.

pabloarosado commented 5 days ago

Hi @pabloarosado. Ah, thanks for the fix. I should have forgotten about adding the dag file in the archive main.yml

Regarding those latest steps, yes, I think that when I use the tool to update steps, these latest steps are moved to the archive and disappear from the "live" yaml. At least I have seen that I need to re-add those steps to the latter.

I guess this could have been caused by the other issue, of not having the dag/archive/poverty_inequality.yml file properly accounted for. But please let me know if something like this happens again in the future.