ERDDAP / erddap

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
Creative Commons Zero v1.0 Universal

BagIt Download #64

Closed: ThomasThelen closed this issue 2 years ago

ThomasThelen commented 2 years ago

Hi,

I was browsing through the source code and ran across a BagIt download implementation. I'm not totally sure, but is there an extra write of the data to the filesystem? I saw this comment. If so, I solved this problem in DataONE's data server by creating SpeedBagIt, which avoids duplicating data when serving datasets, and I'd be more than happy to optimize the BagIt routine here.

BobSimons commented 2 years ago

Thanks for the offer, but... My understanding is that your system requires that the files to be included in the BagIt file already exist. You thus make the BagIt file and serve it to the user on-the-fly. The situation in ERDDAP is different. The source files for the BagIt file don't already exist. Yes, you or I could make it so that the needed files are created and added to the BagIt file directly, but the time and disk space needed for ERDDAP's approach haven't been a problem. It seems like a significant project for minimal payoff, and which would add complexity to ERDDAP (never a good thing). And BagIt creation is a secondary, not-frequently-used feature of ERDDAP. So I don't think it is worth the time and effort.

There are other things with higher priority on the To Do list. I welcome your assistance with these projects. Please see https://github.com/BobSimons/erddap/labels/enhancement

Best wishes.

ThomasThelen commented 2 years ago

"your system requires that the files to be included in the BagIt file already exist"

That's correct: it operates on InputStreams to the bytes.
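
For readers unfamiliar with the InputStream-based approach, here is a minimal sketch of the general idea: each payload stream is copied straight into a zip while its checksum is computed in the same pass, so no intermediate copy is written to disk. This is not SpeedBagIt's or ERDDAP's actual code. The class `StreamingBagSketch` and its methods are hypothetical names made up for illustration, and the bag layout (`data/` payload directory, `manifest-sha256.txt`, `bagit.txt`) is assumed from the BagIt 1.0 specification.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch: streams a zipped BagIt bag without staging payload files on disk. */
public class StreamingBagSketch {

    /**
     * Writes a zipped bag to out. Each payload stream is hashed while it is copied,
     * so the data is read once and never written to a temporary file.
     * Note: closing the zip stream also closes out.
     */
    public static void writeBag(String bagName, Map<String, InputStream> payload, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        StringBuilder manifest = new StringBuilder();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, InputStream> file : payload.entrySet()) {
                String path = "data/" + file.getKey();
                MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
                zip.putNextEntry(new ZipEntry(bagName + "/" + path));
                // DigestInputStream updates the digest as bytes pass through it.
                try (InputStream in = new DigestInputStream(file.getValue(), sha256)) {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, n);
                    }
                }
                zip.closeEntry();
                manifest.append(toHex(sha256.digest())).append("  ").append(path).append('\n');
            }
            // Tag files go last because the manifest is only complete after all payload is read.
            writeText(zip, bagName + "/manifest-sha256.txt", manifest.toString());
            writeText(zip, bagName + "/bagit.txt",
                    "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n");
        }
    }

    /** Adds a small UTF-8 text entry to the zip. */
    public static void writeText(ZipOutputStream zip, String name, String text) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(text.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    /** Formats a digest as lowercase hexadecimal. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

A caller would pass a map from relative file names to open streams plus the destination stream, for example a servlet response's output stream. The sketch only works when a stream of bytes can be obtained for every file up front, which is the precondition discussed in this thread.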

Totally understand that the ERDDAP architecture differs in that aspect. I'll be sure to browse around the other issues :)

Cheers,

Thomas