barkait closed this issue 6 years ago
Hi,
Invalid Alphabet means that the read was discarded because it contained a letter which did not correspond to A, C, G, or T. In FastQ files this typically corresponds to occurrences of N.
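If you want to check your own files for such reads, here is a minimal Python sketch. It is not AptaSuite's parser; it assumes the plain four-lines-per-record FASTQ layout and a case-sensitive comparison against A, C, G, and T, and the function name `find_invalid_reads` is made up for illustration.

```python
import io

# Assumption: only uppercase A, C, G, T count as valid letters.
VALID = frozenset("ACGT")


def find_invalid_reads(handle):
    """Yield (read_id, sequence, offending_letters) for reads with non-ACGT letters.

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored
        bad = sorted(set(seq) - VALID)
        if bad:
            yield header.strip()[1:], seq, bad


# Small in-memory example: the second read contains an N.
sample = "@r1\nACGT\n+\nIIII\n@r2\nACNT\n+\nIIII\n"
for read_id, seq, bad in find_invalid_reads(io.StringIO(sample)):
    print(read_id, seq, bad)
```

For a real file, pass an open text handle instead of the `io.StringIO` object.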
If your data is artificial and you know for sure no invalid nucleotides occur, there might be a bug in the parser which is assigning the read to this case by mistake.
If you can provide me with a minimal example where you encounter this issue, I will be more than happy to look into it.
Thank you for reporting this!
I can't log in to GitHub right now, but I've had the same issue. For me it was caused by the files not being in UNIX format and UTF-8 encoded. Might be the same thing?
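A quick way to test for this is to look at the raw bytes of the file. The sketch below is only a generic check, assuming "not UNIX" means carriage returns in the line endings; I don't know how the AptaSuite parser actually reacts to each of these, and `check_text_format` is a made-up helper name.

```python
def check_text_format(data):
    """Return a list of human-readable problems found in the raw bytes of a file."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("UTF-8 byte order mark at start of file")
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        problems.append("UTF-16 byte order mark at start of file")
    if b"\r" in data:
        problems.append("carriage returns (non-UNIX line endings)")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    return problems


# A record saved with Windows line endings is flagged; a clean one is not.
print(check_text_format(b"@r1\r\nACGT\r\n+\r\nIIII\r\n"))
print(check_text_format(b"@r1\nACGT\n+\nIIII\n"))
```

To run it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the result in; for very large files, checking only the first few megabytes is usually enough.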
As you said, I found some N's in my data, so that must be the reason. Thanks!
I have added a description of the meaning of the individual parsing statistics to the Wiki.
Thanks again!
Hey,
When I parse my data (which is kind of artificial), some entries are classified as "invalid alphabet". I can't share my FASTQ files at the moment, but could you elaborate on what exactly "invalid alphabet" means?
Best,