mutalyzer / mutalyzer2

HGVS variant nomenclature checker

Track source for reference files #387

Closed martijnvermaat closed 8 years ago

martijnvermaat commented 8 years ago

Previously, the original source for a reference file was implicit:

  1. If accession number starts with LRG_, it came from the LRG FTP archive.
  2. If a download URL is known, it was downloaded from there.
  3. If slice data is known, it was sliced from the NCBI.
  4. If a GI number is known, it was downloaded from the NCBI.
  5. Otherwise, it was uploaded.
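
To illustrate why this was fragile, the implicit rules above can be sketched as a single fall-through function. This is only an illustration, not the actual Mutalyzer code: the function name, parameter names, and the returned labels are all assumptions made for this example.

```python
from typing import Optional, Tuple


def infer_source(accession: str,
                 download_url: Optional[str] = None,
                 slice_data: Optional[Tuple[str, int, int]] = None,
                 gi_number: Optional[str] = None) -> str:
    """Guess where a reference file came from, using the old implicit rules.

    All names and labels here are hypothetical. The order of the checks
    matters: each rule only applies if none of the earlier ones matched.
    """
    if accession.startswith("LRG_"):
        return "lrg"          # rule 1: LRG FTP archive
    if download_url is not None:
        return "url"          # rule 2: downloaded from a known URL
    if slice_data is not None:
        return "ncbi_slice"   # rule 3: sliced from the NCBI
    if gi_number is not None:
        return "ncbi"         # rule 4: downloaded from the NCBI
    return "upload"           # rule 5: nothing else known, so uploaded
```

Note that rule 4 depends on a GI number being present, so without GI numbers a plain NCBI download would be indistinguishable from an upload under this scheme.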

In preparation for the removal of GI numbers (#349), this had to be revisited, since rule 4 depends on a GI number being stored. We now store the source explicitly in a new source field on the Reference model. If additional information is needed to re-fetch the file from this source (e.g., the download URL), it is stored in a new source_data field, which is always serialized as a string. This scheme should be both more explicit and more generic.
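
A minimal sketch of the explicit scheme, using a plain dataclass in place of the real database model so that it runs on its own. Only the field names source and source_data come from this change; the set of allowed source values, the JSON encoding for slice data, and the helper slice_reference are assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of source labels; the actual values may differ.
SOURCES = ("ncbi", "ncbi_slice", "lrg", "url", "upload")


@dataclass
class Reference:
    """Stand-in for the Reference model (not the real ORM class)."""
    accession: str
    source: str                        # where the file originally came from
    source_data: Optional[str] = None  # extra info for re-fetching, as a string

    def __post_init__(self) -> None:
        if self.source not in SOURCES:
            raise ValueError("unknown source: %r" % (self.source,))
        if self.source_data is not None and not isinstance(self.source_data, str):
            raise TypeError("source_data must be serialized as a string")


def slice_reference(accession: str, chromosome: str,
                    start: int, stop: int, orientation: str) -> Reference:
    """Build a reference for a slice; JSON is an assumed serialization."""
    data = json.dumps(
        {"chromosome": chromosome, "start": start,
         "stop": stop, "orientation": orientation},
        sort_keys=True,
    )
    return Reference(accession, "ncbi_slice", data)


# A URL download only needs the URL itself as source_data.
url_ref = Reference("UD_1", "url", "https://example.org/reference.gb")
# An upload cannot be re-fetched, so it carries no source_data.
upload_ref = Reference("UD_2", "upload")
```

Keeping source_data as an opaque string means a new kind of source can be added without another schema change; only the code that re-fetches from that source needs to know how to parse it.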
