Reading-eScience-Centre / pycovjson

Create CovJSON files from common scientific data formats
BSD 3-Clause "New" or "Revised" License
11 stars 7 forks source link

pycovjson-convert unable to read multiple variables on CLI #29

Open lewismc opened 7 years ago

lewismc commented 7 years ago

When I invoke the following all is good

lmcgibbn@LMC-056430 ~/Desktop $ pycovjson-convert -i osu_cioss_weekly_msla_geovel_2012_v1.nc.gz.nc -o osu_cioss_weekly_msla_geovel_2012_v1.nc.gz.nc.covjson -v sea_surface_height_above_sea_level
Error occured: Variable "sea_surface_height_above_sea_level" has no axis attribute
Error occured: Variable "sea_surface_height_above_sea_level" has no axis attribute
Converting....
Completed in:  0.03203699999999998 seconds.

however, when I attempt to invoke with multiple parameters it appears that the CLI is not built such that the parameters are interpreted correctly

mcgibbn@LMC-056430 ~/Desktop $ pycovjson-convert -i osu_cioss_weekly_msla_geovel_2012_v1.nc.gz.nc -o osu_cioss_weekly_msla_geovel_2012_v1.nc.gz.nc.covjson -v sea_surface_height_above_sea_level surface_geostrophic_northward_sea_water_velocity

How do I pass multiple variables ? The code suggests that I should be able to pass a List of variables however this is never interpreted from the CLI. I read the covjson specification and one MAY write multiple parameters. Can you clarify the position here @RileyWilliams and then I'll go ahead and submit a PR to resolve.

RileyWilliams commented 7 years ago

@lewismc this is not yet implemented but is a relatively quick fix, this part of write.py is responsible for writing variables to a range. I aim to implement this soon.

letmaik commented 7 years ago

Keep in mind that writing multiple parameters to a file is only useful if the total file size doesn't blow up too much. Otherwise, linking each range via a URL and writing separate range documents makes more sense. This should be an option, something like --external-data. @lewismc Which scenario would make sense for you?

lewismc commented 7 years ago

Cool. I'm more than happy to code it, let me know. Ta

lewismc commented 7 years ago

In all honesty, I think the external data one and writing external documents. The issue of CovJSON documents blowing up quickly is one that we are all very much aware of.

lewismc commented 7 years ago

@letmaik @RileyWilliams I am thinking that -a/--all should be an option here. As you mentioned @letmaik documents get very large, very quickly so I think that there should be a variable substitution in the -o/--output filename something like

CLI invocation

pycovjson-convert -i nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4 -a -o nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4.covjson

The above invocation would write everything into one large file, whereas the following

pycovjson-convert -i nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4 -a -e -o nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4.covjson

would write one coverage for each variable e.g. the output files would be nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4-${variable1}.covjson, nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4-${variable2}.covjson, ..., nc_20160304-DMI-L4UHfnd-NSEABALTIC-v01-fv01-DMI_OI.nc.gz.nc4-${variableN}.covjson, etc.

In the above example the -a/--all parameter indicates that every variable should be processed. The -e/--external-data variable indicates that each variable should be written to a separate file.

Does this sound logical?