airr-community / airr-standards

AIRR Community Data Standards
https://docs.airr-community.org
Creative Commons Attribution 4.0 International

Ensuring MiAIRR is not NCBI specific - RE: CRWG #45

Closed bcorrie closed 6 years ago

bcorrie commented 7 years ago

Hi All,

This is not urgent, but I wanted to log it, as it is something that we should probably address in coordination with CRWG.

I am pretty sure that we discussed this, but I can't seem to find that discussion in the issues... I think we agreed that the standard should use NCBI examples but not explicitly state that NCBI should always be used...

I raised this issue at the Common Repository Group meeting a fair while ago (https://github.com/airr-community/common-repo-wg/issues/10), pointing out that the wording of their Recommendation 4 is too NCBI specific. The group agreed that this was probably not appropriate, but noted that the wording came from the Minimal Standards working group, so Minimal Standards should agree on the change and perhaps provide some alternate wording.

Current wording is: "Recommendation 4: For long-term storage, data and metadata should be deposited in the Sequence Read Archive (SRA) and GenBank, per the recommendations established by the AIRR Minimal Standards Working Group. The AIRR Working Groups should work with SRA/GenBank to customize metadata capture for AIRR data." Note the explicit statement that data should be deposited in SRA/GenBank.

I suggested changing the wording to something like this:

"Recommendation 4: For long-term storage, data and metadata should be deposited in one of the International Nucleotide Sequence Database Collaboration (INSDC) or similar archives such as SRA, GenBank, and ENA, per the recommendations established by the AIRR Minimal Standards Working Group. The AIRR Working Groups should work with the INSDC archives to coordinate the accurate gathering and storage of metadata for AIRR data."

It would be good if we could provide the CRWG some feedback on this so we can close off the issue.

bussec commented 7 years ago

We had a related discussion several months ago, when we were working on the GenBank specification. The two sentences that we put in the preface were:

"In general, it is RECOMMENDED to deposit section 5 and 6 information in a database compliant with the recommendations of the AIRR common repository working group (WG) [REF]. However, in terms of standard compliance, it is sufficient to deposit the required information in other general-purpose databases, if an AIRR-accepted specification on information mapping exists for the respective database."

bussec commented 6 years ago

CRWG recommendation 4 has now been broadened to cover all INSDC databases (commit c8e751a).