mbari-org / SeafloorMappingDB

Make MBARI seafloor mapping datasets more accessible and useful
GNU General Public License v3.0

Record publication citations and DOIs #82

Open MBARIMike opened 3 years ago

MBARIMike commented 3 years ago

From the Use Case document:

Record the submission of data to repositories and the DOIs received. Record publication citations and DOIs resulting from the data.

jbpaduan commented 2 years ago

Adding little files within, or concatenating to Notes files in, the survey directories could become really messy for recording the DOIs issued by data repositories and the citations in manuscripts. As I've now done a few of those MGDS submissions and we've published on even more of the surveys, I'm inclined to do it differently. I think it would be more manageable if maybe there is a series of larger, spreadsheet-like files outside the survey directories that gets added to over time.

One particularly messy scenario, whether it's many little files or even a single file, is that if we have multiple versions of data at MGDS (say, we renavigate old surveys with new and/or add to a compilation - think our on-going work at Pescadero Basin or Axial - and send MGDS new processed files or a compilation's GIS products), they will issue another set of DOIs. If little files in each survey directory were the way we did it, we'd have to both put the new DOIs in new little files and append them to existing little files. I fear it would become very difficult to manage.

Another complicating issue is that multiple DOIs are issued for each survey. There is one for each data type (the raw .s7k files, the processed *p.mb88 files, the CTD data, the main vehicle computer logs, and documentation (notes file and process.cmd)); and then there are separate DOIs for each of the GIS products of the compilation made from all the surveys in that submission. (It's kind of crazy how it has evolved to be so many DOIs, but they've had to adapt their system to our data.)
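One way to picture the "spreadsheet-like file outside the survey directories" idea is a single flat table with one row per (survey, data type, submission), so that a resubmission only appends rows. The sketch below is illustrative only: the `DoiRecord` fields, the data type labels, and the helper names are assumptions, not anything that exists in SeafloorMappingDB, and any DOI values passed in would be placeholders.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class DoiRecord:
    """One DOI issued by a repository (all field names are hypothetical)."""
    survey: str      # survey name
    data_type: str   # e.g. "raw_s7k", "processed_mb88", "ctd"
    submission: int  # 1 for the first submission, 2 for a resubmission, ...
    doi: str         # DOI string as issued


def append_records(existing_csv: str, new_records: list) -> str:
    """Return CSV text containing the existing rows followed by new_records."""
    names = [f.name for f in fields(DoiRecord)]
    rows = list(csv.DictReader(io.StringIO(existing_csv))) if existing_csv else []
    rows.extend(asdict(r) for r in new_records)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def dois_for_survey(csv_text: str, survey: str) -> list:
    """List every DOI recorded for one survey, across data types and submissions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["doi"] for row in reader if row["survey"] == survey]
```

With this layout a new MGDS submission never touches earlier rows; the older DOIs stay in the table alongside the new ones.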

As for citations, there is likely to be a many-to-many relationship here, too: one paper can draw on several surveys, and one survey can appear in several papers. In a static DB, this could be dealt with smoothly, but it can't be done so easily with lots of little files in many survey directories.
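A minimal sketch of how a relational database handles the many-to-many case, using a junction table in an in-memory SQLite database. The table and column names (`survey`, `citation`, `survey_citation`) and the helper functions are hypothetical and are not taken from the project's actual schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a junction table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE survey (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE citation (id INTEGER PRIMARY KEY, reference TEXT UNIQUE NOT NULL);
        CREATE TABLE survey_citation (
            survey_id INTEGER NOT NULL REFERENCES survey(id),
            citation_id INTEGER NOT NULL REFERENCES citation(id),
            PRIMARY KEY (survey_id, citation_id)
        );
    """)
    return conn


def link(conn: sqlite3.Connection, survey_name: str, reference: str) -> None:
    """Record that a citation uses a survey, creating either side if needed."""
    conn.execute("INSERT OR IGNORE INTO survey (name) VALUES (?)", (survey_name,))
    conn.execute("INSERT OR IGNORE INTO citation (reference) VALUES (?)", (reference,))
    conn.execute(
        """INSERT OR IGNORE INTO survey_citation (survey_id, citation_id)
           SELECT s.id, c.id FROM survey s, citation c
           WHERE s.name = ? AND c.reference = ?""",
        (survey_name, reference),
    )


def citations_for(conn: sqlite3.Connection, survey_name: str) -> list:
    """All citations linked to one survey, sorted alphabetically."""
    rows = conn.execute(
        """SELECT c.reference FROM citation c
           JOIN survey_citation sc ON sc.citation_id = c.id
           JOIN survey s ON s.id = sc.survey_id
           WHERE s.name = ? ORDER BY c.reference""",
        (survey_name,),
    )
    return [r[0] for r in rows]


def surveys_for(conn: sqlite3.Connection, reference: str) -> list:
    """All surveys linked to one citation, sorted alphabetically."""
    rows = conn.execute(
        """SELECT s.name FROM survey s
           JOIN survey_citation sc ON sc.survey_id = s.id
           JOIN citation c ON c.id = sc.citation_id
           WHERE c.reference = ? ORDER BY s.name""",
        (reference,),
    )
    return [r[0] for r in rows]
```

Each link is a single row in `survey_citation`, so adding a paper that uses ten surveys means ten small inserts rather than edits to ten directories.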

MBARIMike commented 2 years ago

I think we should go forward with a set of spreadsheet-like files outside the survey directories. We can develop export/import software that imposes the database structure on an exported spreadsheet, allowing for simple editing there. Then we can import those changes back to the database with the goal of keeping them always synchronized.
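The export/edit/import round trip could look roughly like the following sketch. It uses CSV text and a plain dictionary as stand-ins for the spreadsheet and the database; the column names in `COLUMNS` and both function names are assumptions made for illustration. The import step rejects a file whose header no longer matches the expected structure, which is one simple way of "imposing the database structure" on the edited file.

```python
import csv
import io

# Hypothetical column layout shared by the export and the import.
COLUMNS = ["survey", "compilation", "citation"]


def export_rows(db: dict) -> str:
    """Write the database stand-in (survey -> field dict) as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for survey in sorted(db):
        writer.writerow({"survey": survey, **db[survey]})
    return out.getvalue()


def import_rows(db: dict, csv_text: str) -> list:
    """Apply edited CSV text to db and return the surveys that changed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"expected columns {COLUMNS}, got {reader.fieldnames}")
    changed = []
    for row in reader:
        survey = row.pop("survey")
        if db.get(survey) != row:
            db[survey] = row  # new or edited survey row
            changed.append(survey)
    return changed
```

Importing an unedited export reports no changes, so the two sides can be compared cheaply before anything is written back.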

jbpaduan commented 4 months ago

Names of MGDS compilations are now a column in the new /SMDB/*survey_tally.xlsx files. URLs to the compilations are adequate in lieu of DOIs, because it turns out MGDS creates DOIs for each data type in a submission, resulting in multiple DOIs per survey. The prefix for the compilation URLs should be https://www.marine-geo.org/tools/search/entry.php?id= (as in https://www.marine-geo.org/tools/search/entry.php?id=EscanabaTrough_MBARI)
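Turning a compilation name from that column into a link is then a matter of appending it to the prefix. A small sketch, where the function and constant names are hypothetical and the percent-encoding of unusual characters is a precaution rather than a documented MGDS requirement:

```python
from urllib.parse import quote

# Prefix quoted in the comment above.
MGDS_ENTRY_PREFIX = "https://www.marine-geo.org/tools/search/entry.php?id="


def compilation_url(name: str) -> str:
    """Build the MGDS entry URL for a compilation name."""
    name = name.strip()
    if not name:
        raise ValueError("compilation name is empty")
    # Letters, digits, "_", ".", "-" and "~" pass through unchanged.
    return MGDS_ENTRY_PREFIX + quote(name, safe="")


print(compilation_url("EscanabaTrough_MBARI"))
```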

jbpaduan commented 4 months ago

Still need to ponder how to deal with citations, since they may be a many-to-many relationship.