EDIorg / EMLassemblyline

R package for creating EML metadata
https://ediorg.github.io/EMLassemblyline/
MIT License

Allow physical metadata to be defined in a template #107

Open clnsmth opened 2 years ago

clnsmth commented 2 years ago

make_eml() requires access to the data files so that physical metadata (e.g. file size, checksum) can be calculated and the user doesn't have to deal with these minutiae. In some cases this requirement is unnecessarily restrictive (e.g. large, unchanged data files stored in a remote location), or it prevents the user from overriding EAL assumptions (e.g. formatName estimated from the file extension).

A possible solution is to store this physical metadata in a new metadata template. The template could be created before running make_eml(), or left absent when the user wants the values calculated for them, as is currently done.
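To make the fallback concrete, here is a minimal sketch of the idea in Python rather than R. It is not EMLassemblyline code: the function names, the dictionary keys (`size_bytes`, `checksum_md5`, `format_name`), and the extension-to-format mapping are all hypothetical, and MD5 is assumed as the checksum method purely for illustration.

```python
import hashlib
import os

# Hypothetical mapping; how the package actually estimates formatName may differ.
FORMAT_BY_EXTENSION = {
    ".csv": "text/csv",
    ".txt": "text/plain",
    ".zip": "application/zip",
}


def compute_physical(path):
    """Calculate physical metadata by reading the data file itself."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    ext = os.path.splitext(path)[1].lower()
    return {
        "size_bytes": os.path.getsize(path),
        "checksum_md5": md5.hexdigest(),
        "format_name": FORMAT_BY_EXTENSION.get(ext, "unknown"),
    }


def resolve_physical(path, template_row=None):
    """Prefer values from the template; fall back to calculating them.

    template_row is a hypothetical dict holding one row of the proposed
    template. When it is given, the data file is never opened, so the file
    does not need to be accessible locally.
    """
    if template_row is not None:
        return dict(template_row)
    return compute_physical(path)
```

With a template row supplied, `resolve_physical()` returns the user's values unchanged, which covers both the remote-file case and the override case; with no row, behavior matches the current calculate-everything approach.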

joefutrelle commented 2 years ago

This solution seems workable to me as we could produce and maintain the template in an automated fashion and then simply drop it in place in our assembly line workflow. It would also allow us to track the template in our GitHub repositories for each package.

clnsmth commented 2 years ago

The proposed template is https://docs.google.com/spreadsheets/d/1dMbGtbmfVzUWUR0TyMUFHtkdrIckXc_-nRwKpoqkUmY/edit?usp=sharing

joefutrelle commented 2 years ago

Looks good to me. In the scenario where we transfer files to EDI out of band (because we can't host them), would the other entity URL be left blank? Or does EML require it?

clnsmth commented 2 years ago

You're right @joefutrelle, in that case the URL field would be blank or NA so the EDI Repository wouldn't try downloading the associated data object.
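As a rough illustration of that rule (a Python sketch, not code from EAL or the EDI Repository; the function name is hypothetical), a consumer of the template could treat a blank or NA URL as "no download":

```python
def should_download(url):
    """Return True only when the template supplies a usable URL.

    A missing value, an empty string, or the literal "NA" all mean the
    data object is delivered out of band and should not be fetched.
    """
    if url is None:
        return False
    return url.strip().upper() not in ("", "NA")
```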

joefutrelle commented 2 years ago

Perfect.