NCAR / intake-esm-datastore

Intake-esm Datastore
Apache License 2.0

Picking the latest version fails for cmip.py builder #69

Closed jbusecke closed 4 years ago

jbusecke commented 4 years ago

I have built a catalog file using cmip.py with the --pick-latest-version flag. I am still seeing duplicate versions in my catalog file.

I experimented with the _pick_latest_version function locally and was able to rectify my dataframe by adding the field 'dcpp_init_year' to the fields that are excluded from the groupby call:

grpby = list(set(df.columns.tolist()) - {'path', 'version', 'dcpp_init_year'})

In my case the dcpp_init_year column is populated entirely by NaNs, which might throw off pandas groupby.
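A minimal sketch of why an all-NaN column can interfere with the grouping: by default, pandas groupby drops rows whose group key contains NaN, so no groups are formed when dcpp_init_year is among the keys. The dataframe, column values, and the pick_latest helper below are made up for illustration; this is not the actual _pick_latest_version from cmip.py, and the exact symptom in the builder depends on how it uses the grouped result.

```python
import numpy as np
import pandas as pd

# Toy catalog (hypothetical values): two versions of one dataset, one of another.
df = pd.DataFrame(
    {
        "source_id": ["A", "A", "B"],
        "variable_id": ["tas", "tas", "tas"],
        "dcpp_init_year": [np.nan, np.nan, np.nan],
        "version": ["v20190101", "v20200101", "v20190101"],
        "path": ["/a/v20190101/tas.nc", "/a/v20200101/tas.nc", "/b/v20190101/tas.nc"],
    }
)


def pick_latest(df, exclude):
    """Hypothetical helper: keep the row with the highest version per group."""
    keys = sorted(set(df.columns) - set(exclude))
    # Highest version string within each group of identical key values.
    latest = df.groupby(keys, as_index=False)["version"].max()
    # Keep only the rows of df that match a (keys, latest version) pair.
    return df.merge(latest, on=keys + ["version"], how="inner")


# With dcpp_init_year among the keys, every row has a NaN key and is dropped.
all_keys = sorted(set(df.columns) - {"path", "version"})
n_groups_with_nan_key = len(df.groupby(all_keys).size())

# Excluding dcpp_init_year from the keys restores the expected grouping.
result = pick_latest(df, exclude={"path", "version", "dcpp_init_year"})
print(n_groups_with_nan_key)
print(result[["source_id", "version", "path"]])
```

An alternative to excluding the column, assuming pandas 1.1 or newer, would be to pass dropna=False to groupby so that NaN is treated as a regular key value.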

andersy005 commented 4 years ago

Thank you for posting this, @jbusecke!

In my case the dcpp_init_year column is populated entirely by NaNs, which might throw off pandas groupby.

This is one thing I overlooked in my previous implementation.

andersy005 commented 4 years ago

I will update the script

jbusecke commented 4 years ago

Would be happy to submit a PR, but prob need some help constructing the tests.

andersy005 commented 4 years ago

As an update, I addressed this issue in #71.

Would be happy to submit a PR, but prob need some help constructing the tests.

Constructing tests for the existing utilities is a reasonable thing to do. It just hasn't been done yet. It would be nice to have some tests in the future.

jbusecke commented 4 years ago

I just tested the new PR and can confirm that it works. Thanks.