AlsoATraveler closed this issue 1 year ago
The same happens elsewhere in A*01:01:02: the sequence TGGAGAACGGGAAGGAGACGCTGCAGCGCACGGA appears as TGGAGAACGGGAAGGAGACGCTGCAGCGCACGGG in A_gen.fasta.
Hello, I have reviewed the sequence you flagged and found no issue with it. The sequence in 3.40.0 is the same as the sequence in the latest release, and it is correct. The problem is that the CDS sequence in the A_nuc.fasta file has to be split into its exons before you search for it in A_gen.fasta, which contains both exons and introns.
For example, you say that the A_nuc.fasta sequence ends with AAAGTGTGA, which does not appear in A_gen.fasta. That is because the sequence you are searching for spans two exons: the end of exon 7 and all of exon 8. It does not appear in A_gen.fasta because the intron 7 sequence lies between the two. The exon 7 and exon 8 sequences of A*01:01:02 are:
exon 7: GCAGTGACAGTGCCCAGGGCTCTGATGTGTCTCTCACAGCTTGTAAAG
exon 8: TGTGA
The same is true for your second comment, whose sequence crosses the boundary between exons 3 and 4.
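To make the exon 7 / exon 8 case concrete, here is a minimal Python sketch. It uses the two exon sequences quoted above. The intron is a placeholder: only its first five bases (GTGAG) are taken from this thread, and the run of N characters stands in for the rest of intron 7, which is not reproduced here. The helper name find_exons_in_order is hypothetical and not part of any IMGT/HLA tooling.

```python
# Exon sequences for A*01:01:02 as quoted in this thread.
EXON7 = "GCAGTGACAGTGCCCAGGGCTCTGATGTGTCTCTCACAGCTTGTAAAG"
EXON8 = "TGTGA"

# ASSUMPTION: placeholder intron. Only the leading GTGAG comes from the thread;
# the N's stand in for the real intron 7 sequence, which is not shown here.
INTRON7_PLACEHOLDER = "GTGAG" + "N" * 20

cds_tail = EXON7 + EXON8                            # how the end looks in A_nuc.fasta
genomic_tail = EXON7 + INTRON7_PLACEHOLDER + EXON8  # how the end looks in A_gen.fasta


def find_exons_in_order(exons, genomic):
    """Locate each exon in the genomic sequence, scanning left to right.

    Returns one start offset per exon, or -1 for an exon that is not found.
    Each search starts after the previous hit so that a short exon cannot
    match by chance inside an earlier region.
    """
    positions = []
    cursor = 0
    for exon in exons:
        idx = genomic.find(exon, cursor)
        positions.append(idx)
        if idx >= 0:
            cursor = idx + len(exon)
    return positions


query = "AAAGTGTGA"  # spans the exon 7 / exon 8 junction

print(query in cds_tail)            # junction-spanning query exists in the CDS
print(query in genomic_tail)        # but is interrupted by the intron in genomic DNA
print("AAAGGTGAG" in genomic_tail)  # exon 7 end followed by the intron start
print(find_exons_in_order([EXON7, EXON8], genomic_tail))
```

Searching exon by exon finds both pieces, whereas the junction-spanning string is only present in the CDS. With the real files, the exon boundaries would need to come from the database's own annotation rather than being hard-coded as they are here.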
Oh, thanks.
Hello, I have a question about A*01:01:02 in version 3.40. The sequence for A*01:01:02 in the A_nuc.fasta file (that is, the CDS sequence) ends with AAAGTGTGA, but the A_gen.fasta file has no corresponding sequence; it contains AAAGGTGAG instead. What is the reason?