Open pieterlukasse opened 2 years ago
For the question "some genes do not have a value in `ensembl_canonical_gene`, but do have Ensembl transcript IDs set for `mskcc_canonical_transcript` and `uniprot_canonical_transcript`", take gene "GAGE5" as an example.

`ensembl_canonical_gene` has no value because it is filled by looking for a match in `ensembl_biomart_geneids` by `gene_stable_id` or `hgnc_symbol`, and `ensembl_biomart_geneids` does not contain this gene.

`mskcc_canonical_transcript` and `uniprot_canonical_transcript` have values because they are filled by looking for a match by `hgnc_symbol` in the overrides tables, `isoform_overrides_at_mskcc_grch38` and `isoform_overrides_uniprot`. Those tables do contain these genes, so a match is found and a value is set.
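The lookup behaviour described above can be illustrated with a minimal sketch. This is not the actual import script: the table contents, the placeholder IDs, and the helper names `lookup_canonical_gene` and `build_record` are all assumptions made up for illustration. Only the table and column names come from the discussion.

```python
# Hypothetical, simplified stand-ins for the real tables.
# All IDs below are placeholders, not real Ensembl identifiers.
ensembl_biomart_geneids = [
    {"gene_stable_id": "ENSG_PLACEHOLDER_1", "hgnc_symbol": "TP53"},
    # Note: no row for GAGE5, mirroring the situation described above.
]
isoform_overrides_at_mskcc_grch38 = {"GAGE5": "ENST_PLACEHOLDER_MSKCC"}
isoform_overrides_uniprot = {"GAGE5": "ENST_PLACEHOLDER_UNIPROT"}


def lookup_canonical_gene(gene_stable_id, hgnc_symbol):
    """Return the matching gene_stable_id from ensembl_biomart_geneids.

    A row matches on gene_stable_id or on hgnc_symbol; None means no match.
    """
    for row in ensembl_biomart_geneids:
        if row["gene_stable_id"] == gene_stable_id or row["hgnc_symbol"] == hgnc_symbol:
            return row["gene_stable_id"]
    return None


def build_record(gene_stable_id, hgnc_symbol):
    """Assemble the three fields discussed in this issue for one gene."""
    return {
        "hgnc_symbol": hgnc_symbol,
        # Depends only on ensembl_biomart_geneids.
        "ensembl_canonical_gene": lookup_canonical_gene(gene_stable_id, hgnc_symbol),
        # Depend only on the overrides tables, keyed by hgnc_symbol.
        "mskcc_canonical_transcript": isoform_overrides_at_mskcc_grch38.get(hgnc_symbol),
        "uniprot_canonical_transcript": isoform_overrides_uniprot.get(hgnc_symbol),
    }


record = build_record(None, "GAGE5")
print(record)
```

Because the gene field and the transcript fields are resolved against different tables, a gene that is missing from `ensembl_biomart_geneids` but present in the overrides tables ends up with transcripts set and no canonical gene, which is the pattern seen for GAGE5.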
Ticket to make sure the issues found and discussed in #58 comment are investigated and fixed. Main suspect is the following script: