bcgsc / RNA-Bloom

:hibiscus: reference-free transcriptome assembly for short and long reads

FastqReader Error? #1

Closed Atalasia closed 5 years ago

Atalasia commented 5 years ago

I am trying to run RNA-Bloom on Nanopore cDNA sequencing reads, but it fails.

Below is the message I get when I run the program on an uncompressed FASTQ input file.

ERROR: Unsupported file format detected in input file `sample.fastq`. Only FASTA and FASTQ formats are supported.
rnabloom.io.FileFormatException: Unsupported file format detected in input file `sample.fastq`. Only FASTA and FASTQ formats are supported.
        at rnabloom.RNABloom.checkInputFileFormat(RNABloom.java:309)
        at rnabloom.RNABloom.main(RNABloom.java:4750)

Below is the message I get when I run the program on a bgzip-compressed FASTQ file. It runs for some time and then dies, leaving a bunch of intermediate files.

...
Parsed 4,308,866 sequences.
        Corrected: 4,308,203(99.98461%)
        Discarded: 663(0.015386879%)
Reads corrected in 54m 50s
Clustering long reads for "rnabloom"
ERROR: null
java.lang.NullPointerException
        at rnabloom.io.FastaReader.<init>(FastaReader.java:44)
        at rnabloom.RNABloom.clusterLongReads(RNABloom.java:2210)
        at rnabloom.RNABloom.clusterLongReads(RNABloom.java:3740)
        at rnabloom.RNABloom.main(RNABloom.java:5150)

Below are the args: [-ntcard, -c, 3, -k, 17, -indel, 10, -e, 3, -p, 0.8, -long, sample.fastq, -t, 16, -outdir, .]

Below is the Java version:

java version "11.0.2" 2019-01-15 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.2+9-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.2+9-LTS, mixed mode)

I've tried different cDNA samples (all Nanopore), but they all give the same error.

kmnip commented 5 years ago

Can you please report the first 4 lines of your sample.fastq?
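
For anyone wanting to run the same check locally, here is a minimal Python sketch that pulls the first four lines of a FASTQ file and tests the basic record layout (header starting with `@`, sequence, separator starting with `+`, quality string of the same length as the sequence). The function names `first_fastq_record` and `looks_like_fastq` are illustrative and not part of RNA-Bloom, and the `.gz` suffix test is an assumption about how compressed files are named. It does not reproduce RNA-Bloom's own format detection.

```python
import gzip
from itertools import islice


def first_fastq_record(path):
    """Return the first four lines of a FASTQ file as a list of strings.

    Assumption: compressed inputs end in ".gz". bgzip output is
    gzip-compatible, so gzip.open can read it.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [line.rstrip("\n") for line in islice(handle, 4)]


def looks_like_fastq(record):
    """Check the basic four-line FASTQ layout of a single record."""
    if len(record) != 4:
        return False
    header, seq, plus, qual = record
    return (
        header.startswith("@")
        and plus.startswith("+")
        and len(seq) == len(qual)
    )
```

Calling `looks_like_fastq(first_fastq_record("sample.fastq"))` gives a quick yes/no before the four lines are posted to an issue.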

Atalasia commented 5 years ago

@e96d8a78-3d17-4496-9fd3-318a5be25b99 runid=9b8ab3b5f7d423c1d3d2674cba21a76cd74bd205 sampleid=PM-PU-1027-T-A1 read=6 ch=356 start_time=2018-09-27T04:29:03Z
AAAATTATTAGTTCCAGTTCAGGTGTTTATGATTCTTCTTTTTCTTTTGGATGCTTGGCATTTTAATCGGCGATAAAAGAACAAAGATTGCCAGCAGGAACGACAAAGAACAGAACCAAAGACTGCCTGCGCCTTCAGCGAGACCAGGAACTGTGGTGATGCCAGACAGCTGCCAGCGGCTATCTTCCATGGAATAACGTGCGTCATCAACCTGATATGCAACACCTTCCGCTCACCCGCCACTCCTACATGCCCTGCCAGCGATATCGGGC
+
"$%#&$"%$"#)))('(%%$#$$$"'*%%&%"$###)$*,+++",,)'$*&)%"$"&%&%+.0)(%&'*+-'$&$'(($(*((+&##+%$(%#$'#'(%*&$$#(,)'&,'+*)/-''%&$*(*'('%%'$(%%$%"&+%%$&,(%'&((++(*&&)'%%$(%"$$($$$$"''()((&(#""'(++)$)$&#%#&,-*,*-,+)#%'&&+$%$&(%%"$$#%"(%#$)%"$"%"##$"%'"%#)(#$'&'),.(&&*+%#&$&*%%&$##"

EDIT: The file was originally named PM-PU-1027-T-A1.fastq rather than sample.fastq. I changed the name in this post because I didn't think it would be relevant.

kmnip commented 5 years ago

Weird. Your snippet of the FASTQ file looks identical to the ones I have been assembling. I am in contact with a user who also assembled from FASTQ files and she was able to run RNA-Bloom to completion without any issues with the read parsers.

What is your operating system?

If you are okay with sharing your data, I can try a larger snippet (~100 lines) of your read files. You can email me a link to the read file(s): kmnip@bcgsc.ca

Atalasia commented 5 years ago

Just as a sanity check, I've tried running RNA-Bloom with the older Java version specified in the README. So far, the program is running well.

Below is the version I am using:

openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

I'll give an update when the program finishes.

kmnip commented 5 years ago

Thanks for the sanity check! The code was compiled with Java 8, but I tested with Java 10 before. I actually expect a different part of the code would break in Java 11...
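
The two version banners quoted in this thread use different numbering schemes: Java 8 and earlier report themselves as `1.x` (so `1.8.0_181` is Java 8), while later releases report the major version first (so `11.0.2` is Java 11). As a rough sketch, a wrapper script could read the major version out of the first line of `java -version` output as follows. The function name `java_major_version` is hypothetical and nothing like it ships with RNA-Bloom.

```python
import re


def java_major_version(banner):
    """Extract the major Java version from a `java -version` banner line."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        raise ValueError(f"no version string found in: {banner!r}")
    first, second = match.group(1), match.group(2)
    # Releases up to Java 8 are reported as 1.x (e.g. 1.8.0_181 is Java 8).
    if first == "1" and second is not None:
        return int(second)
    return int(first)
```

Applied to the banners above, this separates the Java 11 runtime that hit the NullPointerException from the Java 8 runtime that ran normally.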

kmnip commented 5 years ago

I was able to reproduce the java.lang.NullPointerException when I ran RNA-Bloom with OpenJDK 11.

kmnip commented 5 years ago

I have figured out what the issue is! I will make a new release (which contains a fix for this) within this week.

kmnip commented 5 years ago

This bug is fixed in release v1.1.1. Thanks for reporting this bug!

Atalasia commented 5 years ago

Right. I was waiting for the program to finish, but otherwise I think it's running fine (I think I have to rerun it with an increased heap size). Cheers.