claczny / VizBin

Repository of our application for human-augmented binning

Unknonw column header: 67 #36

blancaverag closed 8 years ago

blancaverag commented 8 years ago

Hi!

This error appears when I upload my contigs FASTA file together with an annotation file that has label and gc columns. I am attaching the annotation file.

2016-05-10 16:08:38,792 ERROR Thread-0 - Unknonw column header: 67
lcsb.vizbin.service.InvalidMetaFileException: Unknonw column header: 67
    at lcsb.vizbin.service.DataSetFactory.createDataSetFromFastaFile(DataSetFactory.java:321)
    at lcsb.vizbin.service.DataSetFactory.createDataSetFromFastaFile(DataSetFactory.java:212)
    at lu.uni.lcsb.vizbin.ProcessInput$2.run(ProcessInput.java:134)
2016-05-10 16:08:38,795 DEBUG Thread-0 - Error! Check the logs.

annotation_for_VizBin_gc.csv.txt

When I run the program without the annotation file, it runs correctly but prints this warning:

Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK

Help is very welcome!

claczny commented 8 years ago

Hi,

thank you for using VizBin.

The attached file appears to be malformed: it contains only a single line, so the line breaks are missing. In general, the order of the entries in the annotation file must match the order of the sequences that were visualized. Each line (except the header line, which is considered "line 0" here) corresponds to one sequence: line 1 in the annotation is for the first sequence, line 2 for the second sequence, and so on.

Please check the process that generated your annotation file and make sure it writes line breaks. In the meantime, I have tried to fix your file and attached the result.
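A quick way to catch this kind of problem before loading the files is to compare the number of annotation lines with the number of sequences. The following is only a minimal sketch, not part of VizBin: the function names are made up for illustration, and it assumes the annotation file is comma-separated with one header line.

```python
import csv


def count_fasta_records(fasta_path):
    """Count sequences by counting header lines that start with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def check_annotation(fasta_path, annotation_path):
    """Return a list of problems; an empty list means nothing was found.

    Assumption: comma-separated annotation file, first line is the header.
    """
    with open(annotation_path, newline="") as handle:
        rows = [row for row in csv.reader(handle) if row]
    if not rows:
        return ["annotation file is empty"]

    header, data = rows[0], rows[1:]
    problems = []

    # One annotation line per sequence, not counting the header ("line 0").
    n_seqs = count_fasta_records(fasta_path)
    if len(data) != n_seqs:
        problems.append(f"{n_seqs} sequences but {len(data)} annotation lines")

    # Every annotation line should have as many fields as the header.
    for i, row in enumerate(data, start=1):
        if len(row) != len(header):
            problems.append(
                f"line {i}: expected {len(header)} fields, found {len(row)}"
            )
    return problems
```

A file whose line breaks are missing is read as a header with no data lines, so the check reports a count mismatch. Note that this only compares counts; it cannot tell whether the lines are in the same order as the sequences.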

If you want to use annotation information, you should generally filter your sequences by length before running them through VizBin; see also Issue #3.
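Length filtering can be done with a few lines of Python before the annotation file is generated, so that the annotation is built from exactly the sequences that will be loaded. This is a rough sketch under assumptions: the function names are hypothetical, and the default cutoff of 1000 nt is only an example value that you should set to match the minimum length you use in VizBin.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    # Emit the last record, if the file contained any.
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(in_path, out_path, min_length=1000):
    """Write sequences of at least min_length to out_path, keeping their order.

    Returns the number of sequences kept. The cutoff is an example value.
    """
    kept = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            if len(seq) >= min_length:
                out.write(f">{header}\n{seq}\n")
                kept += 1
    return kept
```

The returned count is the number of annotation lines (excluding the header) that the filtered FASTA file needs.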

Kindly let me know if this solves your issue.

annotation_for_VizBin_gc.csv.fixed.txt

blancaverag commented 8 years ago

That works, thank you very much for your fast answer!

Regarding the warning (Failed to load..., as mentioned above), do you think something important is missing?

claczny commented 8 years ago

Right now, I am not 100% sure whether there is some automatic fallback for this situation. I'd suggest that you let it run and check whether the plot makes sense. If not, please try to change the "PCA library" to "EJML" under "Show additional options" and get back to me with some more information regarding your operating system.

Please note that the "EJML" PCA library is slower than the "MTJ" library, which is why the latter is the default; see also the wiki.

Moreover, I saw that your annotation file contains around 100,000 lines, so I assume you want to visualize that many contigs >= 1000 nt? While this is possible with VizBin, the computation will likely take a few minutes and require quite a lot of RAM. Using -Xmx4g is probably a good idea; see also the wiki.

blancaverag commented 8 years ago

Thank you very much again!!

claczny commented 8 years ago

You're welcome!