DataONEorg / d1_pypackage

Implements a Python tool for serializing data packages from DataONE
Apache License 2.0

compare to datapackage in R for similar API #3

Open mbjones opened 9 years ago

mbjones commented 9 years ago

Design this Python implementation to be compatible with the corresponding R datapackage package. The R datapackage implementation is under development, and its API includes classes for DataObject, DataPackage, ResourceMap, and SystemMetadata. It would be nice if the two APIs corresponded.
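
For illustration only, here is a rough sketch of what a Python API mirroring those four class names might look like. Only the class names come from the R package as described above. Every field and method (`identifier`, `format_id`, `add_object`, `describe`, the `"documents"` predicate) is a hypothetical placeholder and is not taken from either the R package or the existing DataONE Python libraries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SystemMetadata:
    # Hypothetical fields; the real schema is defined by DataONE.
    identifier: str
    format_id: str
    size: int = 0


@dataclass
class DataObject:
    # A payload plus the system metadata that describes it.
    sysmeta: SystemMetadata
    data: bytes = b""


@dataclass
class ResourceMap:
    # (subject, predicate, object) statements relating package members.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_relation(self, subject: str, predicate: str, obj: str) -> None:
        self.relations.append((subject, predicate, obj))


@dataclass
class DataPackage:
    objects: Dict[str, DataObject] = field(default_factory=dict)
    resource_map: ResourceMap = field(default_factory=ResourceMap)

    def add_object(self, obj: DataObject) -> None:
        # Index members by their identifier.
        self.objects[obj.sysmeta.identifier] = obj

    def describe(self, metadata_id: str, data_id: str) -> None:
        # Record that a metadata object documents a data object.
        self.resource_map.add_relation(metadata_id, "documents", data_id)
```

A package would then be assembled by creating `DataObject` instances, adding them with `add_object`, and recording relationships with `describe`. None of this touches a network service.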

datadavev commented 9 years ago

This project is primarily to support serialization of the package to a local file system so that it can be reliably used by other tools / services. Much of the implementation (e.g. classes for System Metadata, Resource Maps, and Data Object) has been available in the DataONE Python libraries for some time - it would be interesting to evaluate the R implementation in that context.

mbjones commented 9 years ago

Yeah, in R we decided to make dataone depend on datapackage (rather than the inverse), so that other non-dataone R code could also use and create data packages for other systems. This meant that SystemMetadata was defined in datapackage, and the dataone library then imports that and adds all of the service calls. So R's datapackage is a standalone data package representation that can serialize to BagIt with SystemMetadata and does not depend on having access to any other DataONE services. It would be nice if the BagIt-based package format was standalone on all platforms, even though it means some refactoring. Open to discussion...
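
To make the "standalone" point concrete, here is a minimal sketch of writing a BagIt-style directory using only the Python standard library, so no DataONE service or client library is needed. The function name `write_bag` and the flat `payload` mapping are assumptions made for this example. It writes only the bag declaration, the payload directory, and a SHA-256 manifest, and it does not show where SystemMetadata or a resource map would be stored in the bag.

```python
import hashlib
from pathlib import Path
from typing import Dict


def write_bag(bag_dir: str, payload: Dict[str, bytes]) -> Path:
    """Write a minimal BagIt-style bag (hypothetical helper).

    `payload` maps flat file names (no subdirectories) to file contents.
    """
    root = Path(bag_dir)
    data_dir = root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        # Payload files live under data/ in a bag.
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")

    # Bag declaration required by the BagIt specification.
    (root / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    # Payload manifest: one "<checksum>  <path>" line per file.
    (root / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return root
```

In this arrangement, a DataONE client library would import the packaging code and add the service calls on top of it, rather than the packaging code importing the client.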