NBISweden / swefreq

Swedish Frequency resource for genomics website
https://swefreq.nbis.se/
GNU General Public License v3.0

Updates to the beacon breakends #567

Closed MalinAhlberg closed 5 years ago

MalinAhlberg commented 5 years ago

Describe the pull request:

Pull request long description:

Small fixes to the SQL model and the import script for breakends. Also adds the --add_reversed_mates flag to the importer, which makes a BND searchable in the beacon using either of its two chromosomes as the "main" one.

Changes made:

  1. Set default values: 0 for allele_num (callCount) and allele_count (variantCount), '' for mate_id.
  2. Skip counting calls when importing mates, even when asked to, since we don't know what to do with these numbers anyway (the dataset usually already has that info stored in the db).
  3. "chromosomeStart" => chromosomeStart (drop the quotes so the capital S is not enforced)
  4. Add option --add_reversed_mates to the importer. It adds one extra row to the db for each BND, representing the same breakend but with its mate encoded as the starting chromosome.
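
To illustrate change 4, here is a minimal sketch of how a forward row and its reversed-mate row could be built from one BND record. It is not the importer's actual code: the function names `parse_bnd` and `reversed_mate`, the dict layout, and the regular expression are assumptions made for illustration. The column names and the 0-based start (54720 in the VCF becomes 54719) follow the table in the question below.

```python
import re

# Simplified pattern for the VCF breakend ALT forms t[p[, t]p], ]p]t and [p[t.
# Assumption: contig names contain no ':' or brackets.
BND_ALT = re.compile(r"^[^\[\]]*([\[\]])([^:\[\]]+):(\d+)\1[^\[\]]*$")


def parse_bnd(chrom: str, pos: int, var_id: str, alt: str) -> dict:
    """Build the forward row for one BND record (hypothetical helper)."""
    match = BND_ALT.match(alt)
    if match is None:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    return {
        "chromosome": chrom,
        "chromosomestart": pos - 1,              # VCF is 1-based, the table is 0-based
        "chromosomepos": var_id,
        "mate": match.group(2),
        "matestart": int(match.group(3)) - 1,
        "matepos": "",                           # '' default, as in change 1
    }


def reversed_mate(row: dict) -> dict:
    """Return the same breakend with the mate as the starting chromosome."""
    return {
        "chromosome": row["mate"],
        "chromosomestart": row["matestart"],
        "chromosomepos": row["matepos"],
        "mate": row["chromosome"],
        "matestart": row["chromosomestart"],
        "matepos": row["chromosomepos"],
    }


forward = parse_bnd("1", 54720, "cluster_216", "N[1:54720[")
backward = reversed_mate(forward)
```

With the record from the question, `forward` and `backward` differ only in which of `chromosomepos` and `matepos` carries the ID.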

Question:

Should a "loop" like this

1 54720 cluster_216 N N[1:54720[

be kept once or twice in the db?

| chromosome | chromosomestart | chromosomepos | mate | matestart | matepos     |
|------------+-----------------+---------------+------+-----------+-------------|
| 1          |           54719 | cluster_216   | 1    |     54719 |             |
| 1          |           54719 |               | 1    |     54719 | cluster_216 |
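
Whichever answer is chosen, the importer would need a way to recognise such a loop. A minimal sketch, assuming rows are dicts keyed by the column names in the table above (the helper name `is_self_mate` is hypothetical and not part of the importer):

```python
def is_self_mate(row: dict) -> bool:
    """True when a breakend's mate has the same chromosome and start as the breakend itself."""
    return (
        row["chromosome"] == row["mate"]
        and row["chromosomestart"] == row["matestart"]
    )


# The loop from the question, as stored with a 0-based start.
loop_row = {
    "chromosome": "1", "chromosomestart": 54719, "chromosomepos": "cluster_216",
    "mate": "1", "matestart": 54719, "matepos": "",
}

# A made-up, non-loop breakend for contrast (coordinates are illustrative only).
other_row = {
    "chromosome": "1", "chromosomestart": 54719, "chromosomepos": "example_id",
    "mate": "2", "matestart": 1000, "matepos": "",
}
```

A check like this would let the importer skip the reversed row for loops if the decision is to keep them only once.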