Some rows of the mutation table include alleles with `-` besides IUPAC bases. For instance:

![Image](https://github.com/monarch-initiative/oncoexporter/assets/12170955/9b45fa79-4fb6-4163-88b7-9ec38813d750)

The VCF specification does not allow for dashes in the allele strings. Therefore, we cannot create a VCF record for the row without consulting the reference genome sequence and fetching the previous base.
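
To make the "previous base" requirement concrete, here is a minimal sketch of the conversion. It is not the project's code: the function name `to_vcf_alleles`, the `fetch_base` callable, and the toy reference are all hypothetical. It also assumes MAF-style coordinates, which should be checked against the actual table: for a deletion, `start` is the first deleted base; for an insertion, `start` is the base immediately before the inserted sequence. All positions are 1-based.

```python
from typing import Callable, Tuple

# Hypothetical callable: returns the reference base at a 1-based position.
# In practice it could be backed by an indexed FASTA, e.g.
# pysam.FastaFile(path).fetch(contig, pos - 1, pos), which is not used here.
BaseFetcher = Callable[[str, int], str]


def to_vcf_alleles(
    contig: str, start: int, ref: str, alt: str, fetch_base: BaseFetcher
) -> Tuple[int, str, str]:
    """Turn a dash-style allele pair into VCF-style (pos, ref, alt)."""
    if ref == "-" and alt == "-":
        raise ValueError("Both alleles are '-'; nothing to convert")
    if ref == "-":
        # Insertion. ASSUMPTION: `start` is the base before the inserted bases.
        anchor = fetch_base(contig, start)
        return start, anchor, anchor + alt
    if alt == "-":
        # Deletion. ASSUMPTION: `start` is the first deleted base,
        # so the anchor is the base right before it.
        anchor_pos = start - 1
        if anchor_pos < 1:
            raise ValueError("Deletion at contig start is not handled in this sketch")
        anchor = fetch_base(contig, anchor_pos)
        return anchor_pos, anchor + ref, anchor
    # No dash: the alleles are already valid for VCF.
    return start, ref, alt


# Toy reference used only for illustration.
TOY_REFERENCE = {"chr1": "ACGTACGT"}


def toy_fetch(contig: str, pos: int) -> str:
    return TOY_REFERENCE[contig][pos - 1]


deletion = to_vcf_alleles("chr1", 3, "GT", "-", toy_fetch)
insertion = to_vcf_alleles("chr1", 4, "-", "TT", toy_fetch)
```

Keeping the base lookup behind a callable means the same conversion would work whether the base comes from a local FASTA file or from a remote service.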

Fetching the base with `pysam` or a similar library is not a big deal. However, that would require the library user to download a FASTA file. Alternatively, we could make a REST call to fetch the base, although I am not sure I know of such an API (perhaps VariantValidator, starting from the HGVS c. string?).

The current code skips creating a VCF record for these rows. The rest of the row, including the HGVS strings, tumor/normal read depths, etc., is still processed.
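
The skip behavior can be pictured roughly as follows. This is only an illustrative sketch, not the actual implementation; `has_dash_allele` and `vcf_fields_or_none` are hypothetical names.

```python
from typing import Optional, Tuple


def has_dash_allele(ref: str, alt: str) -> bool:
    """True if either allele contains '-', which VCF does not allow."""
    return "-" in ref or "-" in alt


def vcf_fields_or_none(
    contig: str, pos: int, ref: str, alt: str
) -> Optional[Tuple[str, int, str, str]]:
    """Return the VCF fields for a row, or None if the row must be skipped."""
    if has_dash_allele(ref, alt):
        # No anchor base is available without the reference genome,
        # so no VCF record is produced for this row.
        return None
    return contig, pos, ref, alt


snv = vcf_fields_or_none("chr1", 100, "A", "G")
skipped = vcf_fields_or_none("chr1", 100, "-", "TT")
```

A caller would still process the remaining columns of the row when the result is `None`.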