stcorp / harp

Data harmonization toolset for scientific earth observation data
http://stcorp.github.io/harp/doc/html/index.html
BSD 3-Clause "New" or "Revised" License

Allow patterns for input files for harpcollocate #268

Closed. StevenCompernolle closed this issue 2 years ago.

StevenCompernolle commented 2 years ago

The way to specify input seems to be different for harpcollocate compared to e.g., harpmerge.

harpmerge allows patterns, e.g., /path/to/*.nc, but this does not seem to be possible for harpcollocate.

My use case is as follows: I have several .nc files (which I want as input to harpcollocate) and several .pth files (which I do not want as input) at the same depth, below the same root. Specifying the top directory would make both the .nc files and the .pth files become input to harpcollocate, something I want to avoid.

A small remark: the harpmerge documentation does not mention that files can be specified using patterns.

svniemeijer commented 2 years ago

The patterns are interpreted by your Linux shell, not by harp. Your shell turns a pattern into a list of individual paths, so harpmerge or harpcollocate is invoked with multiple paths as arguments. As you can imagine, if you provide a concatenated list of two datasets to harpcollocate, it can no longer split that list back into the two distinct sets that need to be collocated. So what you are asking for is not possible (unless we fundamentally change the argument parsing approach).
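To illustrate the point about expansion, here is a minimal sketch in Python. It uses the standard `glob` module to mimic what a shell does with a pattern; the directory layout and file names are hypothetical, and the string "harpcollocate" only stands in for the program name (no real harpcollocate options are shown). After expansion, the program sees one flat argument list with nothing marking where the first dataset ends and the second begins.

```python
import glob
import os
import tempfile


def expand(pattern):
    """Mimic shell pattern expansion: return the sorted matching paths."""
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: two datasets stored in sibling directories.
    for name in ("a/x1.nc", "a/x2.nc", "b/y1.nc"):
        path = os.path.join(root, name)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    dataset_a = expand(os.path.join(root, "a", "*.nc"))
    dataset_b = expand(os.path.join(root, "b", "*.nc"))

    # What the invoked program receives: a single flat list of paths.
    # The boundary between dataset A and dataset B is gone.
    argv = ["harpcollocate"] + dataset_a + dataset_b
    names = [os.path.basename(p) for p in argv[1:]]
    print(names)
```

With two patterns on one command line, the receiving program would have to guess which of the resulting paths belong to which dataset.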

Given your use case, why not just create .pth files for your *.nc dataset? You can export the result of harpdump --dataset directly to a .pth file, so you could even script it.
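As a rough sketch of how the scripting could look, the Python snippet below writes all files matching a pattern into a .pth file. This is not the harpdump --dataset export mentioned above but a plain substitute, and it assumes that a .pth file is a plain text file with one file path per line; please check the HARP documentation before relying on that. The helper name `write_pth` and all file names are hypothetical.

```python
import glob
import os
import tempfile


def write_pth(pattern, pth_path):
    """Write every path matching pattern to pth_path, one absolute path per line.

    Assumption: a .pth file is plain text with one file path per line.
    """
    paths = sorted(os.path.abspath(p) for p in glob.glob(pattern))
    with open(pth_path, "w") as handle:
        for p in paths:
            handle.write(p + "\n")
    return paths


with tempfile.TemporaryDirectory() as root:
    # Hypothetical directory holding both .nc files and an unrelated .pth file.
    for name in ("obs1.nc", "obs2.nc", "old.pth"):
        open(os.path.join(root, name), "w").close()

    pth_file = os.path.join(root, "dataset.pth")
    listed = write_pth(os.path.join(root, "*.nc"), pth_file)

    # Read the result back: only the .nc files are listed.
    with open(pth_file) as handle:
        lines = handle.read().splitlines()
    print([os.path.basename(p) for p in lines])
```

The resulting .pth file could then be passed in place of the directory, so the unwanted .pth files at the same level are not picked up as input.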