Closed by scisco 7 years ago
@rhartran for your review
Is the granuleId assigned by Cumulus? Or, does it come from the data provider?
What time format is this using? How hard is it to convert to a local time when viewing? Would it be easier to use "date" data type?
Would it help to introduce a "file type" in the files section? This could match the file type in the PDR and make it easier for jobs like: tell me the URL for the metadata file, or browse file.
@rhartran my responses below:
Is the granuleId assigned by Cumulus? Or, does it come from the data provider?
It is extracted from the filename using a regex set by the operator. This assumes that all files of a collection follow a specific naming convention.
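To make the regex approach concrete, here is a minimal sketch. The pattern, the sample filename, and the function name `extract_granule_id` are all hypothetical and only illustrate the idea of an operator-supplied regex whose first capture group is the granuleId; this is not the actual Cumulus code.

```python
import re

# Hypothetical operator-supplied pattern for one collection.
# The first capture group is taken as the granuleId.
GRANULE_ID_REGEX = re.compile(
    r"^(MOD09GQ\.A\d{7}\.h\d{2}v\d{2}\.\d{3}\.\d{13})\..+$"
)


def extract_granule_id(filename, pattern=GRANULE_ID_REGEX):
    """Return the granuleId captured from filename, or None if it does not match."""
    match = pattern.match(filename)
    return match.group(1) if match else None


# Files that follow the naming convention share one granuleId.
print(extract_granule_id("MOD09GQ.A2016358.h13v04.006.2016360104606.hdf"))
print(extract_granule_id("MOD09GQ.A2016358.h13v04.006.2016360104606.hdf.met"))
# A file that breaks the convention yields None and needs separate handling.
print(extract_granule_id("unexpected_name.txt"))
```

The `None` case is where the naming-convention assumption shows up: any file that does not match the operator's pattern cannot be assigned to a granule this way.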
What time format is this using? How hard is it to convert to a local time when viewing? Would it be easier to use "date" data type?
We use Unix epoch time because DynamoDB has no native date type; timestamps have to be stored as a string or a number. Converting to local time shouldn't be much of a problem; Unix time handles it quite well.
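As a rough illustration of the conversion, the sketch below turns an epoch value into a timezone-aware datetime. It assumes the stored value is in seconds (if the table stores milliseconds, divide by 1000 first); the function name `epoch_to_local` is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def epoch_to_local(epoch_seconds, tz=None):
    """Convert Unix epoch seconds to an aware datetime.

    With tz=None the result is in the viewer's system local timezone.
    """
    utc_time = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return utc_time.astimezone(tz)


# Same stored number, rendered for two different viewers.
print(epoch_to_local(0, timezone.utc).isoformat())
print(epoch_to_local(0, timezone(timedelta(hours=-5))).isoformat())
```

Because the stored number is timezone-independent, the conversion can be left entirely to whatever displays the record.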
Would it help to introduce a "file type" in the files section? This could match the file type in the PDR and make it easier for jobs like: tell me the URL for the metadata file, or browse file.
It does help. Although I think the right place for the fileType would be in the collectionTable not in the granuleTable. What do you think?
Comments preceded by RMH:
Bob
It is extracted from the filename using a Regex set by the operator. Obviously there is an assumption here that all files of a collection follow a specific naming convention.
RMH: We should check with the DAACs to make sure that is OK. I think that algorithm would not support granule replacement scenarios where Cumulus needs to keep the old instance of the granule and add the new instance also. For this use case, SDPS offers options to automatically replace the granule in the public/protected directory and move the old instance to a hidden/private directory – or keep the new instance in hidden and let the DAAC manually choose to replace the old instance. Another less likely case is that files from different collections map to the same granuleId given the regex. I think many systems assign their own “unintelligent” key to each granule as it is received.
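One way to picture the "unintelligent" key idea is the sketch below. It is only an illustration under assumptions: the in-memory dict stands in for a table, and `register_granule`, `instances_of`, and the record fields are hypothetical names, not an existing Cumulus or SDPS interface.

```python
import uuid


def register_granule(table, granule_id, collection):
    """Store a granule instance under a system-assigned key and return the key."""
    key = str(uuid.uuid4())  # opaque key, independent of the filename
    table[key] = {"granuleId": granule_id, "collection": collection}
    return key


def instances_of(table, granule_id, collection):
    """Return the keys of every stored instance of a granule in a collection."""
    return [
        key
        for key, record in table.items()
        if record["granuleId"] == granule_id and record["collection"] == collection
    ]


table = {}
old_key = register_granule(table, "G123", "collectionA")
new_key = register_granule(table, "G123", "collectionA")  # replacement arrives
other_key = register_granule(table, "G123", "collectionB")  # same id, other collection

print(len(instances_of(table, "G123", "collectionA")))
```

With an opaque key, the old and new instances of a granule can coexist, and identical granuleIds from different collections no longer collide, while the regex-derived granuleId remains available as an ordinary attribute.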
It does help. Although I think the right place for the fileType would be in the collectionTable not in the granuleTable. What do you think?
RMH: I think some of it depends on how you model a granule. If you model a granule as a set of files (science, browse, metadata, QA, PH) then having a file type associated with each file in the granule seems like an important thing to have. It will allow Cumulus to offer more service when distributing files. If you model a granule as just the science file and have other entities such as browse, metadata, QA, and PH that are associated with the science granule then storing a file type is not necessary.
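For the first model (a granule as a set of typed files), a lookup such as "give me the URL of the metadata file" could look roughly like this. The record layout, the `fileType` and `url` field names, and the bucket paths are assumptions made for the example, not the actual GranulesTable schema.

```python
# Hypothetical granule record: one entry per file, each tagged with a file type.
granule = {
    "granuleId": "G123",
    "files": [
        {"fileType": "science", "url": "s3://example-bucket/G123.hdf"},
        {"fileType": "metadata", "url": "s3://example-bucket/G123.hdf.met"},
        {"fileType": "browse", "url": "s3://example-bucket/G123.jpg"},
    ],
}


def url_for(granule_record, file_type):
    """Return the URL of the first file with the given type, or None if absent."""
    for file_entry in granule_record.get("files", []):
        if file_entry.get("fileType") == file_type:
            return file_entry.get("url")
    return None


print(url_for(granule, "metadata"))
print(url_for(granule, "QA"))  # no QA file in this record
```

If the file type lived only in the collection definition, the same question would first require matching each filename against the collection's per-type patterns.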
GranulesTable
Example Granule Record