Like Darwin Core Archive, but using SQLite
The Darwin Core Archive format is widely used in biodiversity informatics (within the GBIF community, for example) and has proven effective as an exchange format.
However, because of its nature (basically, a bunch of CSV files zipped together), it is a poor candidate for data use and analysis: many users receive data as a Darwin Core Archive, then immediately extract it and transfer it to some custom, non-standard format that's easier to manipulate, such as a spreadsheet or a relational database.
The aim of DarwinSQL is to propose an alternative, standardized file format that can be used for exchanging data, but also for simple data use and analysis.
There are two main milestones to this project:
Defining a new file format with the following characteristics:
Providing a basic implementation of this format:
Work in progress and subject to change at any time. Comments are welcome!
info: table describing the content of the DarwinSQL file in a standardized format, similar to the Metafile of a Darwin Core Archive.
files: table that holds the raw content of each file of the Archive, except the data (CSV) files. It has 2 fields: path (relative path of the file in the originating Darwin Core Archive) and content (BLOB).
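
As a rough illustration of the files table described above, here is a minimal Python sketch that copies the non-data files of a Darwin Core Archive into a SQLite database. It is not the reference implementation: the function name store_non_data_files, the data_files parameter, the TEXT type for path and the PRIMARY KEY / NOT NULL constraints are all assumptions made for the example. Only the two field names (path, content) and the BLOB type come from the description above.

```python
import sqlite3
import zipfile


def store_non_data_files(archive_path, db_path, data_files):
    """Copy every non-data file of a Darwin Core Archive into a `files` table.

    `data_files` is the set of archive member names to skip (the CSV data
    files). How they are identified is left to the caller in this sketch.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back on error
            # Column constraints below are assumptions, not part of the format.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS files ("
                "path TEXT PRIMARY KEY, content BLOB NOT NULL)"
            )
            with zipfile.ZipFile(archive_path) as archive:
                for member in archive.infolist():
                    if member.is_dir() or member.filename in data_files:
                        continue
                    # path is relative to the root of the originating archive
                    conn.execute(
                        "INSERT INTO files (path, content) VALUES (?, ?)",
                        (member.filename, archive.read(member)),
                    )
    finally:
        conn.close()
```

A call such as store_non_data_files("dwca.zip", "dwca.sqlite", {"occurrence.csv"}) would then leave files like the Metafile available through a plain SELECT on the files table (the file names here are only examples).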