Closed by okurman 7 months ago
Hi @okurman, sorry for the late response. In these cases, Selene uses the REF allele from the given VCF file, and the remainder of the 4096 bp sequence is retrieved from the FASTA file.
Selene outputs a warning whenever this situation occurs, if you have a chance to review them. What we often see is that ref/alt are swapped in the VCF file (e.g. alt actually matches the GRCh38 assembly). Maybe that is what is happening here?
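For anyone who wants to check for swapped alleles before running predictions, here is a minimal sketch in plain Python. It is not Selene's code: `GENOME`, `fasta_ref`, `classify`, and `build_window` are hypothetical names for illustration, the toy dictionary stands in for a real FASTA reader, and the window is simply flanked symmetrically rather than reproducing how Selene centers its 4096 bp input.

```python
# Toy stand-in for a FASTA lookup: chromosome name -> sequence (made-up data).
GENOME = {"chr1": "ACGTACGTACGTACGTACGT"}


def fasta_ref(genome, chrom, pos, length):
    """Return `length` reference bases starting at a 1-based VCF position."""
    start = pos - 1  # VCF POS is 1-based; Python slices are 0-based
    return genome[chrom][start:start + length].upper()


def classify(genome, chrom, pos, ref, alt):
    """Label a variant 'match', 'swapped', or 'mismatch' against the FASTA."""
    if fasta_ref(genome, chrom, pos, len(ref)) == ref.upper():
        return "match"      # VCF REF agrees with the assembly
    if fasta_ref(genome, chrom, pos, len(alt)) == alt.upper():
        return "swapped"    # ALT is what the assembly actually contains
    return "mismatch"       # neither allele agrees with the assembly


def build_window(genome, chrom, pos, ref, flank):
    """Take the flanks from the FASTA and the center from the VCF REF.

    This mirrors the behaviour described above (REF from the VCF, the rest
    from the FASTA); the symmetric `flank` is an assumption of this sketch.
    """
    seq = genome[chrom]
    start = pos - 1
    left = seq[max(0, start - flank):start]
    right = seq[start + len(ref):start + len(ref) + flank]
    return left + ref.upper() + right


if __name__ == "__main__":
    # Position 5 of the toy chr1 is "A".
    print(classify(GENOME, "chr1", 5, "A", "G"))
    print(classify(GENOME, "chr1", 5, "G", "A"))
    print(build_window(GENOME, "chr1", 5, "G", 2))
```

Running `classify` over every record in the VCF and counting the three labels would show whether most of the mismatches are swaps or genuine disagreements with the assembly.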
Dear Selene/Sei developers, thank you for the colossal work you've undertaken.
I have a question regarding the variant effect prediction functionality of the Sei model. I am using the model to calculate variant effects for gnomAD variants, and there seem to be many mismatches between the REFs of the gnomAD variants and the GRCh38 assembly FASTA file. So my question is: in these cases, does Selene use the REF of the given VCF file, or does it use the corresponding nucleotide from the GRCh38 FASTA file?