nasa / eo-metadata-tools

eo-metadata-tools is a set of repositories for working with Earth Observation metadata. At its core are common libraries and demonstration scripts for accessing the Common Metadata Repository, to be accompanied by more specific modules and scripts for dataset-specific queries, metadata validation, etc.
Apache License 2.0

Granule search based on its filename #18

Open slesaad opened 3 years ago

slesaad commented 3 years ago

It would be useful for scientists to be able to search for a granule based on its filename. Currently, there is no way to do it.

jceaser commented 3 years ago

Are you talking about the GranuleMetaDataFile field?

clynnes commented 3 years ago

To clarify: "As a scientist, I can paste the filename of a data file I am looking at into some TBD tool, and search CMR for the granule metadata record corresponding to that data file." (Assuming that I have not changed the filename from what it was when I originally downloaded it, natch.) Currently, this is possible only if you also supply some other qualifier, like the provider or the collection conceptId, which the scientist does not always have readily available.

jceaser commented 3 years ago

That sounds like the ECHO onlineaccessurl, AKA the related URL field.

clynnes commented 3 years ago

Mmmm...I don't think so. We have only the filename, not the front part of the URL. Slesa and I were planning to loop through the providers until we found the filename in the readable_granule_id. (Seemed more efficient than looping through all 7500 collections until we found it.)
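
A minimal sketch of the provider loop described above, in Python. It is an illustration, not code from this repository: `find_granule_by_filename`, `fake_search`, and the provider and granule IDs are all hypothetical, and the actual CMR query is left as an injected `search` callable so the sketch runs offline. A real `search` would query CMR for one provider and one filename.

```python
from typing import Callable, Iterable, Optional

# Hypothetical signature: (provider, filename) -> list of matching granule records.
SearchFn = Callable[[str, str], list]


def find_granule_by_filename(
    filename: str, providers: Iterable[str], search: SearchFn
) -> Optional[dict]:
    """Try each provider in turn and stop at the first one that returns a hit."""
    for provider in providers:
        hits = search(provider, filename)
        if hits:
            return {"provider": provider, "granule": hits[0]}
    return None  # no provider knows this filename


# Stand-in for a real CMR query (assumption: made-up provider and granule IDs).
FAKE_INDEX = {"PROV_B": {"example_file_001.h5": {"id": "G0000000001-PROV_B"}}}


def fake_search(provider: str, filename: str) -> list:
    record = FAKE_INDEX.get(provider, {}).get(filename)
    return [record] if record else []


result = find_granule_by_filename(
    "example_file_001.h5", ["PROV_A", "PROV_B", "PROV_C"], fake_search
)
print(result)
```

The loop returns on the first hit, so the cost is at most one query per provider, compared with one per collection in the alternative mentioned above.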

clynnes commented 3 years ago

(I do have a wild and crazy idea about inferring filename regex patterns, adding them as tags to the collection, and then searching through the collections until we find the regex matching the filename, then querying that collection.)

jceaser commented 3 years ago

I don't see any granule field that accepts wildcards, though I do find ranges. Partial field search would require some discussion, I think.

clynnes commented 3 years ago

No, that's not where the regex comes in. The regex is used to figure out which collection a data file came from. We can then search for the granule by its exact filename within a single collection.
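
A rough sketch of that regex step, in Python. Everything here is hypothetical: the collection concept IDs and filename patterns are invented, and a plain dict stands in for patterns that would be stored as tags on the collections. It only shows how a filename could be narrowed to one collection before doing the exact-filename granule search.

```python
import re
from typing import Dict, Optional

# Assumption: made-up concept IDs and filename patterns, standing in for
# patterns stored as collection tags.
COLLECTION_PATTERNS: Dict[str, str] = {
    "C0000000001-PROV_A": r"ALPHA_\d{8}_v\d+\.nc",
    "C0000000002-PROV_B": r"beta\.A\d{7}\.\d{3}\.hdf",
}


def infer_collection(
    filename: str, patterns: Dict[str, str] = COLLECTION_PATTERNS
) -> Optional[str]:
    """Return the first collection whose pattern matches the whole filename."""
    for concept_id, pattern in patterns.items():
        if re.fullmatch(pattern, filename):
            return concept_id
    return None  # no known pattern matches


print(infer_collection("ALPHA_20200101_v2.nc"))
print(infer_collection("unknown.txt"))
```

Once a collection is identified, the granule search itself needs no wildcards: it is an exact-filename query scoped to that single collection.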