IQSS / dataverse

Open source research data repository software
http://dataverse.org

Support for Duplication of Data Collections Across Repositories #2025

Closed posixeleni closed 6 years ago

posixeleni commented 9 years ago

Need to investigate and implement technology in 4.0 to allow for the efficient duplication of data collections across repositories (e.g., Data-PASS).

Some options to consider:

pdurbin commented 8 years ago

@pameyer has a plan for duplicating data collections across repositories that involves installing and operating a proposed optional component of Dataverse called the Data Locality Module (DLM). See #3403 for details. For now I'm adding the "SBGrid" label to this issue as well so we can decide whether this issue can be closed in favor of the one Pete opened.

pameyer commented 7 years ago

DLM support (#3403) is currently post-v1; I'll defer to the IQSS folks on whether this should be closed.

pdurbin commented 6 years ago

Closing in favor of #4706 and #3403, which are much more specific and achievable (smaller chunks) than this gigantic issue that doesn't have a clear definition of done. If someone out there wants support for LOCKSS or ResourceSync or some other technology, please open an individual issue for each technology.

#4157 was about backing up files from S3, which was delivered in pull request #4271.

Support for pushing data to Archivematica was last discussed at https://github.com/IQSS/dataverse/issues/4283#issuecomment-392078932

#3236 was an idea to harvest more than just the metadata of files, that is, to harvest the files themselves.

In a new doc at https://docs.google.com/document/d/1mTaHuulanL3GIbmvu91uabdy5_cMJ5qiKEEAioWPk_0/edit?usp=sharing I'm using the term "replication" rather than "duplication", which is why all this is on my mind.