dieterich-lab / JACUSA2

New version of JACUSA -> 2.0
GNU General Public License v3.0

java.lang.IllegalArgumentException #74

Closed ericmalekos closed 4 months ago

ericmalekos commented 4 months ago

Hi, I'm running the following command, which works fine for a while before throwing an exception. Any idea what could be causing this or what to do about it?

I used STAR to align RNASeq to the GCA_000001405.15_GRCh38 reference genome, and am using the same genome fasta here. https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/

java -jar ~/bin/JACUSA_v2.0.4.jar \
call-1 \
-R ${GENOME} \
-p ${N} \
-f B \
-r jacusa_RNA.bed \
${BAM}

...
INFO    00:01:29  Thread 9: Working on contig chr2:232762822-232862821
INFO    00:01:29  Thread 7: Working on contig chr2:232862822-232962821
INFO    00:01:29  Thread 8: Working on contig chr2:232962822-233062821
INFO    00:01:29  Thread 6: Working on contig chr2:233066891-233166890
INFO    00:01:29  Thread 8: Working on contig chr2:233166891-233266890
java.lang.IllegalArgumentException: Byte 89 unknown
    at lib.util.Base.valueOf(Base.java:74)
    at lib.data.storage.container.FileReferenceProvider.getReferenceBase(FileReferenceProvider.java:90)
    at lib.data.storage.container.FileReferenceProvider.getReferenceBase(FileReferenceProvider.java:68)
    at lib.data.assembler.DataAssembler.createDefaultDataContainer(DataAssembler.java:50)
    at lib.data.assembler.DataAssembler.assembleData(DataAssembler.java:39)
    at lib.util.ReplicateContainer.getNullDataContainer(ReplicateContainer.java:61)
    at lib.util.ConditionContainer.getNullDataContainer(ConditionContainer.java:42)
    at jacusa.worker.CallWorker.createParallelData(CallWorker.java:41)
    at lib.worker.AbstractWorker.hasNext(AbstractWorker.java:111)
    at lib.worker.AbstractWorker.processReady(AbstractWorker.java:196)
    at lib.worker.AbstractWorker.run(AbstractWorker.java:213)

ericmalekos commented 4 months ago

I see: there were some non-ATCGN bases in the assembly, and maybe lowercase characters. I converted lowercase to uppercase and replaced everything that wasn't A, T, C, G, or N with N using this line, and it worked:

awk '/^>/ {print; next} {$0 = toupper($0); gsub(/[^ATCGN]/, "N"); print}' GCA_000001405.15_GRCh38_no_alt_analysis_set.fna > upper_GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
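
For anyone hitting the same error: byte 89 is ASCII `Y`, which is the IUPAC ambiguity code for a pyrimidine (C or T). Before rewriting a reference, it can help to see which unsupported characters it contains and on which contigs. The following is a minimal Python sketch of such a check, not part of JACUSA2; the function name `count_unsupported` is hypothetical, and it assumes a plain-text (uncompressed) FASTA file.

```python
from collections import Counter

# Bases that JACUSA2 is reported to accept (compared case-insensitively here).
ALLOWED = set("ACGTN")


def count_unsupported(fasta_lines):
    """Return {contig_name: Counter} of sequence characters outside ACGTN.

    `fasta_lines` is any iterable of FASTA lines, e.g. an open file handle.
    Contigs without unsupported characters are not included in the result.
    """
    found = {}
    contig = None
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            contig = parts[0] if parts else ""
            continue
        bad = [c for c in line.upper() if c not in ALLOWED]
        if bad:
            found.setdefault(contig, Counter()).update(bad)
    return found


# Small in-memory demonstration; chr(89) is the byte from the exception.
demo = [">chr2 example\n", "ACGTYacgtn\n", "RRN\n", ">chr3\n", "ACGT\n"]
print(chr(89), count_unsupported(demo))
```

To run it on a real reference, pass an open file handle, for example `count_unsupported(open("genome.fna"))`, where `genome.fna` is a placeholder path.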

piechottam commented 4 months ago

JACUSA2 does not support bases other than A, C, G, T, and N. Lowercase bases should work.
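
Given that lowercase is reported to work, the reference may not need to be uppercased at all. Below is a minimal Python sketch that replaces only unsupported characters and leaves soft-masking (lowercase) intact. It is an illustration under that assumption and has not been tested against JACUSA2; `sanitize_fasta` is a hypothetical helper name.

```python
import re

# Any non-whitespace character that is not A, C, G, T, or N in either case.
_UNSUPPORTED = re.compile(r"[^ACGTNacgtn\s]")


def _to_n(match):
    """Replace a character with 'n' if it was lowercase, otherwise 'N'."""
    return "n" if match.group().islower() else "N"


def sanitize_fasta(lines):
    """Yield FASTA lines with unsupported sequence characters replaced by N/n.

    Header lines (starting with '>') are passed through unchanged, and line
    endings are preserved because whitespace is never matched.
    """
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield _UNSUPPORTED.sub(_to_n, line)


# Small in-memory demonstration.
demo = [">chr2 example\n", "ACGTYRacgtyn\n"]
print("".join(sanitize_fasta(demo)), end="")
```

For a real file, the generator can be fed an open input handle and its output written with `writelines` to a new file, so the original reference stays untouched.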