timbitz / Whippet.jl

Lightweight and Fast; RNA-seq quantification at the event-level
MIT License

ERROR: LoadError: Cannot encode 78 to BioSequences.DNAAlphabet{2}() #139

Closed mrb20045 closed 1 year ago

mrb20045 commented 1 year ago

Dear Whippet team, I ran the software without any problem on one of my datasets, but then hit the error below on the rest of them. I tried several different approaches without success. Could you please guide me on this?

Activating project at ~/MRB/RSeq_All_v6/bin/AS/Whippet
23.138898 seconds.
Loading splice graph index... /media/mrb/09128522523/Data/Meta_Fat-tail/SNP-Editing_Meta_Results_Ram-v1/0_All/DAS/Whippet/index/graph.jls
2.193634 seconds (2.52 M allocations: 323.759 MiB, 22.11% gc time)
Processing reads from file...
FASTQ_1: /media/mrb/09128522523/Data/Meta_Fat-tail/Data/Raw/2_HAN_Fat-Thin_Data/SRR13744274_Fat-Han-1_1.fastq.gz
FASTQ_2: /media/mrb/09128522523/Data/Meta_Fat-tail/Data/Raw/2_HAN_Fat-Thin_Data/SRR13744274_Fat-Han-1_2.fastq.gz
ERROR: LoadError: Cannot encode 78 to BioSequences.DNAAlphabet{2}()
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] throw_encode_error(A::BioSequences.DNAAlphabet{2}, src::Vector{UInt8}, soff::Int64)
   @ BioSequences ~/.julia/packages/BioSequences/k4j4J/src/longsequences/copying.jl:216
 [3] encode_chunk
   @ ~/.julia/packages/BioSequences/k4j4J/src/longsequences/copying.jl:228 [inlined]
 [4] encode_chunks!(dst::BioSequences.LongSequence{BioSequences.DNAAlphabet{2}}, startindex::Int64, src::Vector{UInt8}, soff::Int64, N::Int64)
   @ BioSequences ~/.julia/packages/BioSequences/k4j4J/src/longsequences/copying.jl:239
 [5] copyto!(dst::BioSequences.LongSequence{BioSequences.DNAAlphabet{2}}, doff::Int64, src::Vector{UInt8}, soff::Int64, N::Int64, #unused#::BioSequences.AsciiAlphabet)
   @ BioSequences ~/.julia/packages/BioSequences/k4j4J/src/longsequences/copying.jl:361
 [6] copyto!
   @ ~/.julia/packages/BioSequences/k4j4J/src/longsequences/copying.jl:292 [inlined]
 [7] LongSequence
   @ ~/.julia/packages/BioSequences/k4j4J/src/longsequences/constructors.jl:49 [inlined]
 [8] BioSequence
   @ ~/MRB/RSeq_All_v6/bin/AS/Whippet/src/types.jl:74 [inlined]
 [9] fill!(rec::Whippet.FASTQRecord, offset::Int64)
   @ Whippet ~/MRB/RSeq_All_v6/bin/AS/Whippet/src/record.jl:14
 [10] process_paired_reads!(fwd_parser::FASTX.FASTQ.Reader{TranscodingStreams.NoopStream{BufferedStreams.BufferedInputStream{Libz.Source{:inflate, BufferedStreams.BufferedInputStream{IOStream}}}}}, rev_parser::FASTX.FASTQ.Reader{TranscodingStreams.NoopStream{BufferedStreams.BufferedInputStream{Libz.Source{:inflate, BufferedStreams.BufferedInputStream{IOStream}}}}}, param::AlignParam, lib::GraphLib, quant::GraphLibQuant{SGAlignPaired, DefaultCounter}, multi::MultiMapping{SGAlignPaired, DefaultCounter}, mod::DefaultBiasMod; bufsize::Int64, sam::Bool, qualoffset::Int64)
   @ Whippet ~/MRB/RSeq_All_v6/bin/AS/Whippet/src/reads.jl:102
 [11] macro expansion
   @ ~/MRB/RSeq_All_v6/bin/AS/Whippet/src/timer.jl:5 [inlined]
 [12] main()
   @ Main ~/MRB/RSeq_All_v6/bin/AS/Whippet/bin/whippet-quant.jl:143
 [13] top-level scope
   @ ~/MRB/RSeq_All_v6/bin/AS/Whippet/src/timer.jl:5
in expression starting at /home/mrb/MRB/RSeq_All_v6/bin/AS/Whippet/bin/whippet-quant.jl:185

mrb20045 commented 1 year ago

I fixed the problem. Some reads contained N bases.
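For anyone reading the error message: 78 is the ASCII code for the letter N, and a 2-bit DNA alphabet only has room for four symbols (A, C, G, T), so an N cannot be stored. The Python sketch below just illustrates that idea. The `TWO_BIT` table and `encode_2bit` function are hypothetical stand-ins written for this example, not the actual BioSequences implementation.

```python
# The byte value reported in the error message; in ASCII it is the letter N.
print(chr(78))

# Hypothetical 2-bit table for illustration only: two bits give four codes,
# which leaves no spare code for an ambiguous base such as N.
TWO_BIT = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode_2bit(seq):
    """Return the 2-bit code of each base, or raise on any other symbol."""
    codes = []
    for base in seq.upper():
        if base not in TWO_BIT:
            raise ValueError(
                f"Cannot encode {ord(base)} ({base!r}) to a 2-bit DNA alphabet"
            )
        codes.append(TWO_BIT[base])
    return codes
```

Calling `encode_2bit("ACGT")` succeeds, while `encode_2bit("ACNT")` raises an error that mirrors the one in the stack trace.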

RacconC commented 1 year ago

I fixed the problem. Some reads contained N bases.

I am now facing the same problem. How can I fix it, please?

mrb20045 commented 1 year ago

To eliminate the reads with ambiguous nucleotides, the FASTQ files need to be trimmed before running Whippet. Tools such as Trim Galore can perform this task.
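If a dedicated trimming tool is not at hand, the same filtering can be sketched in a few lines of Python. This is a minimal sketch, not the method used in this thread: it assumes gzipped, 4-line-per-record paired FASTQ files with mates in the same order, and it drops a pair when either mate contains an N. The function names are made up for this example.

```python
import gzip
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line FASTQ."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def drop_pairs_with_n(in1, in2, out1, out2):
    """Write only the read pairs in which neither mate contains an N.

    Returns (kept, dropped) pair counts. Assumes both inputs list mates in
    the same order; zip() stops at the shorter file without warning.
    """
    kept = dropped = 0
    with gzip.open(in1, "rt") as r1, gzip.open(in2, "rt") as r2, \
            gzip.open(out1, "wt") as w1, gzip.open(out2, "wt") as w2:
        for rec1, rec2 in zip(read_fastq(r1), read_fastq(r2)):
            # Drop the whole pair so the two output files stay in sync.
            if "N" in rec1[1].upper() or "N" in rec2[1].upper():
                dropped += 1
                continue
            w1.write("\n".join(rec1) + "\n")
            w2.write("\n".join(rec2) + "\n")
            kept += 1
    return kept, dropped
```

Call `drop_pairs_with_n` with the two input paths and two output paths, then pass the filtered files to Whippet instead of the raw ones. Dropping whole pairs rather than single reads keeps the two files aligned record for record, which paired-end processing relies on.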