spase-group / spase-base-model

Specification for the SPASE base information model.
Apache License 2.0

Add fields in Access for filenaming and directory templates #25

Closed candeynasa closed 11 months ago

candeynasa commented 1 year ago

Add fields in Access for filenaming and directory templates to store the patterns for the data filenames and directory structure, using the URI template specification https://github.com/hapi-server/uri-templates/wiki/Specification. This will enable automated search of files matching a time range. SPDF has its datasets described in all.xml using the "%" specification of strptime, which can be migrated to the "$" specification.

For instance, all.xml has:

<access filenaming="ac_or_def_%Y%m%d_%Q.cdf" protocol="https" subdividedby="%Y" timerange_start="1997-08-26 00:00:00" timerange_stop="2023-05-31 00:00:00">
<URL>https://cdaweb.gsfc.nasa.gov/pub/data/ace/orbit/level_2_cdaweb/def_or</URL>

and would use SPASE fields:

access_filenaming_template: "ac_or_def_$Y$m$d_v$v.cdf"
access_directory_template: "$Y"
URL: "https://spdf.gsfc.nasa.gov/pub/data/ace/orbit/level_2_cdaweb/def_or/"
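As a rough illustration of the "%" to "$" migration mentioned above, here is a minimal Python sketch. The function name `percent_to_dollar` and the set of fields it translates are assumptions made for this example, not part of SPASE, all.xml, or the uri-templates code. It only swaps the prefix for common single-letter date/time fields and reports anything else (such as `%Q`) for manual review rather than guessing a mapping.

```python
import re

# Assumption: only these single-letter strptime fields are translated; each has
# a same-letter counterpart in the "$" template syntax.
DIRECT_FIELDS = set("YymdjHMS")


def percent_to_dollar(pattern):
    """Return (template, unmapped) for a "%"-style filename pattern."""
    unmapped = []

    def swap(match):
        letter = match.group(1)
        if letter in DIRECT_FIELDS:
            return "$" + letter
        # Unknown field: leave it untouched and report it to the caller.
        unmapped.append("%" + letter)
        return match.group(0)

    template = re.sub(r"%([A-Za-z])", swap, pattern)
    return template, unmapped


template, unmapped = percent_to_dollar("ac_or_def_%Y%m%d_%Q.cdf")
print(template)  # ac_or_def_$Y$m$d_%Q.cdf
print(unmapped)  # ['%Q']
```

The leftover `%Q` would still need a hand-written replacement, such as the `v$v` used in the proposed template.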

Definition: access_filenaming_template: Filenaming template that stores the pattern for the data filenames, using the URI template specification https://github.com/hapi-server/uri-templates/wiki/Specification. This will enable automated search of files matching a time range. For example, the ACE definitive orbit dataset at NASA SPDF has files following the naming pattern "ac_or_def_$Y$m$d_v$v.cdf", where $Y is the year, $m is the month, $d is the day, and $v is the version number.

access_directory_template: Directory hierarchy template that stores the pattern for the data directory structure, using the URI template specification https://github.com/hapi-server/uri-templates/wiki/Specification. This will enable automated search of files matching a time range. For example, the ACE definitive orbit dataset at NASA SPDF has yearly subdirectories for the data files, following the naming pattern "$Y".

Both are optional and go in the Access information sections, since they are specific to repositories.
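To show how the two templates could drive a time-range search, here is a minimal Python sketch. It is not the uri-templates implementation: `template_regex` and `files_in_range` are hypothetical helpers, the listing entries are made-up filenames, only $Y, $m, $d, and $v are handled, $v is assumed to be a run of digits, and files are assumed to be daily. The filename template is spelled with the underscores of the all.xml pattern (ac_or_def_...).

```python
import re
from datetime import date, timedelta

FIELD = re.compile(r"(\$[A-Za-z])")


def template_regex(template, day):
    """Build a regex for one day from a template using $Y, $m, $d and $v."""
    values = {
        "$Y": f"{day.year:04d}",
        "$m": f"{day.month:02d}",
        "$d": f"{day.day:02d}",
    }
    parts = []
    for piece in FIELD.split(template):
        if piece in values:
            parts.append(values[piece])
        elif piece == "$v":
            parts.append(r"\d+")  # assumption: version is one or more digits
        elif FIELD.fullmatch(piece):
            raise ValueError(f"field not handled in this sketch: {piece}")
        else:
            parts.append(re.escape(piece))  # literal text between fields
    return re.compile("".join(parts))


def files_in_range(listing, directory_template, filename_template, start, stop):
    """Return paths from listing whose date falls in [start, stop)."""
    template = directory_template + "/" + filename_template
    hits = []
    day = start
    while day < stop:
        regex = template_regex(template, day)
        hits.extend(path for path in listing if regex.fullmatch(path))
        day += timedelta(days=1)
    return hits


# Hypothetical listing, relative to the dataset URL.
listing = [
    "1997/ac_or_def_19970826_v01.cdf",
    "1997/ac_or_def_19970827_v01.cdf",
    "1997/ac_or_def_19970901_v01.cdf",
]
print(files_in_range(listing, "$Y", "ac_or_def_$Y$m$d_v$v.cdf",
                     date(1997, 8, 26), date(1997, 8, 28)))
```

A real service would more likely use one of the existing uri-templates libraries, which cover the full field set and qualifiers.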

jvandegriff commented 1 year ago

If a SPASE record had a URI template in it, then it would be possible to build a file listing service automatically (for all SPASE records that had it).

candeynasa commented 1 year ago

Original FTP finder Python code

jbfaden commented 1 year ago

I have Java, Python, and JavaScript codes which support these templates: https://github.com/hapi-server/uri-templates

lfb12345 commented 1 year ago

In the example:

"ac_or_def_$Y$m$d_v$v.cdf", where $Y is the year, $m is the month, $d is the day, and $v is the version number.

Wouldn't $d fail to resolve, since '_v' is appended, yielding $d_v?

jbfaden commented 1 year ago

The syntax is either "$L", where L is a single letter, or "$(LETTERS)", where LETTERS can be a multi-letter word. So if there are no parentheses, then "d" identifies the field and everything up to the next "$" is the separator. Note that the $(LETTERS) form must be used when qualifiers are used with a field, so for example you might use $(d;pad=none).
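The rule described above can be sketched as a small tokenizer. This is an illustrative Python example only, not the code from the uri-templates repository; the `tokenize` function and its tuple layout are assumptions made for this sketch. It shows that "$d_v$v" splits into field d, the literal "_v", and field v.

```python
import re

# Either "$(" + anything up to ")" or "$" + exactly one letter.
TOKEN = re.compile(r"\$(?:\(([^)]*)\)|([A-Za-z]))")


def tokenize(template):
    """Split a template into ("literal", text) and ("field", name, qualifiers)."""
    tokens = []
    pos = 0
    for m in TOKEN.finditer(template):
        if m.start() > pos:
            tokens.append(("literal", template[pos:m.start()]))
        if m.group(1) is not None:
            # Parenthesized form: the name comes first, qualifiers follow after ";".
            name, *qualifiers = m.group(1).split(";")
        else:
            # Single-letter form: exactly one letter, no qualifiers.
            name, qualifiers = m.group(2), []
        tokens.append(("field", name, qualifiers))
        pos = m.end()
    if pos < len(template):
        tokens.append(("literal", template[pos:]))
    return tokens


print(tokenize("$Y$m$d_v$v.cdf"))
print(tokenize("$(d;pad=none)"))
```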

(backticks used so that LaTeX is not used.)

lfb12345 commented 1 year ago

I will be adding these tomorrow.

lfb12345 commented 1 year ago

done

jbfaden commented 12 months ago

This still needs to be done. During the SPASE meeting we looked on the website for the change and didn't see it. AccessDirectoryTemplate does appear in the draft that Lee sent out in the 2023/07/19 18:45 UT email.

tressahelvey commented 11 months ago

@lfb12345 @jbfaden Please confirm if this task is complete. Thank you. Tressa

jbfaden commented 11 months ago

I can confirm this is complete.