mdcao / japsa

Just Another JAva Package for Sequence Analysis
BSD 3-Clause "New" or "Revised" License

NullPointerException XM compression #35

Open miltondts opened 4 years ago

miltondts commented 4 years ago

I'm trying to compress a DNA sequence file and it throws a NullPointerException. Any ideas why?

java -version
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)

$ jsa.xm.compress BuEb
The eXpert-Model (XM) for Compression of DNA Sequences V 3.0
  Minh Duc Cao, T. I. Dix, L. Allison, C. Mears.
  A simple statistical algorithm for biological sequence compression
  IEEE Data Compression Conf., Snowbird, Utah, 2007, [doi:10.1109/DCC.2007.7]

Parameters:
Hash size        : 11
Expert Limit     : 200
Context          : 15
Listen Threshold : 0.15bps
Chances          : 20
BinaryHash       : false
HashType         : Hashtable
Expert Type      : 
 #Reading file(s)
Exception in thread "main" java.lang.NullPointerException
    at japsa.tools.bio.xm.ExpertModelCmd.main(ExpertModelCmd.java:130)

It appears the program doesn't support raw sequence files, only FASTA and FASTQ. Is that correct?

mdcao commented 4 years ago

Thanks for the report. What is the format of your input file?

The program currently supports fasta format.

miltondts commented 4 years ago

The file is just a sequence of letters (A, C, G, T) with no other characters (no newlines, etc.). XM V2 supported this format. I worked around it by putting a '>\n' at the start of the file, but now I'm also experiencing data corruption, as reported in issue 29.
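
For anyone else hitting this with a raw sequence file, the workaround above (a '>' header line in front of the sequence) can be sketched in plain Java as follows. This is only an illustration and not part of japsa: the class name `RawToFasta`, its methods, and the sequence name argument are hypothetical, and it assumes the input holds nothing but sequence letters.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of japsa: wraps a raw sequence file in a
// minimal FASTA record by writing a '>' header line before the sequence.
class RawToFasta {

    // Builds the FASTA header line; an empty name gives the bare ">\n"
    // used in the workaround described in this thread.
    static String headerLine(String name) {
        return ">" + name + "\n";
    }

    // Writes the header, then streams the raw file through unchanged,
    // then terminates the record with a newline.
    // Assumption: the raw file has no trailing newline of its own.
    static void wrap(Path rawInput, Path fastaOutput, String name) throws IOException {
        try (OutputStream out = Files.newOutputStream(fastaOutput)) {
            out.write(headerLine(name).getBytes(StandardCharsets.US_ASCII));
            Files.copy(rawInput, out);
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java RawToFasta <raw-input> <fasta-output> [name]");
            return;
        }
        String name = args.length > 2 ? args[2] : "";
        wrap(Paths.get(args[0]), Paths.get(args[1]), name);
    }
}
```

Running `java RawToFasta BuEb BuEb.fasta BuEb` would write a FASTA copy next to the original, leaving the raw file untouched. This only addresses the input format; it does nothing about the data corruption mentioned above.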