nextstrain / fauna

RethinkDB database to support real-time virus analysis
GNU Affero General Public License v3.0

database dump #59

Closed by phiweger 6 years ago

phiweger commented 7 years ago

Dear all,

Is it possible to get a copy of a nextstrain database dump (e.g. in JSON)? Or is this problematic for proprietary reasons?

Best and thanks, Adrian

trvrb commented 7 years ago

There are issues with data sharing that, unfortunately, prevent this to some degree. For example, our flu data comes from gisaid.org, where there is a fairly strict requirement that sequences not be further shared. However, depending on your virus of interest, reconstructing the database should be straightforward. We've maintained notes on this here:

phiweger commented 7 years ago

So there are restrictions that apply to all collections, i.e. Ebola, Zika etc.?

Thank you for the links!

trvrb commented 6 years ago

@viehwegerlib ---

Sorry to be slow. We are purposely not rehosting data at the moment. In addition to flu, much of the Ebola and Zika data came from sources that had explicit "pre-publication disclaimers" (like here and here). We're working on a solution here in which sequences could be flagged with something like a license. But in the meantime, I'm sorry, but we can't share raw sequence data.