rdpstaff / classifier

RDP extensible sequence classifier for fungal lsu, bacterial and archaeal 16s
GNU General Public License v2.0

Exception handling #5

Open dbrami opened 9 years ago

dbrami commented 9 years ago

Hi,

When trying to merge the existing results of the classifier I get the following error:

cmd-> java -Xmx16g -jar /bio_bin/rdp_classifier_2.8/dist/classifier.jar merge-detail \
    -o merged_classified.txt \
    -h merged_classified.hier.txt \
    -c 0.5 \
    --train_propfile /bioinformatics/bio_db/silva_SSURef_108_tax_silva_trunc/qiime/Silva_108/taxa_mapping/CombinedClassifier/rRNAClassifier.properties \
    ./Corn-Root-P1-MP/Corn-Root-P1-MP.16S18S.univec.rdp ./Corn-Root-P1-Mobio/Corn-Root-P1-Mobio.16S18S.univec.rdp

Exception in thread "main" java.lang.IllegalArgumentException: taxon Node environmental samples in line "M01224:135:000000000-A9TYB:1:1107:21832:7607 Root norank 1.0 Eukaryota Superkingdom 1.0 Fungi Kingdom 0.99 Dikarya Subkingdom 0.99 Basidiomycota Phylum 0.99 environmental samples Genus 0.75" is not found in the original Classifier training data.
    at edu.msu.cme.rdp.classifier.rrnaclassifier.ClassificationParser.next(ClassificationParser.java:107)
    at edu.msu.cme.rdp.multicompare.MultiClassifier.multiClassificationParser(MultiClassifier.java:252)
    at edu.msu.cme.rdp.multicompare.Reprocess.main(Reprocess.java:184)
    at edu.msu.cme.rdp.classifier.cli.ClassifierMain.main(ClassifierMain.java:69)

Is there any way the exception handling could be done more gracefully, so that the merge keeps going while the offending reads are written to a separate file, or at least logged as problematic? BTW, all the reads were classified with the same training set that was supplied on the command line, so they should NOT be generating an error.
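To make the request concrete, here is a minimal sketch of the "skip and record" behavior described above. It is not the RDP Classifier's code: `LenientMerge`, `Result`, and `parseLeniently` are hypothetical names, and the parser passed in is only a stand-in for whatever `ClassificationParser` does per line. The one thing taken from the stack trace is that the failure surfaces as an `IllegalArgumentException`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in the RDP Classifier.
class LenientMerge {

    /** Outcome of a lenient pass: parsed records plus the raw lines that failed. */
    static final class Result<T> {
        final List<T> parsed = new ArrayList<>();
        final List<String> rejected = new ArrayList<>();
    }

    /**
     * Applies the parser to every line. A line whose parse throws
     * IllegalArgumentException is logged and set aside instead of aborting the run.
     */
    static <T> Result<T> parseLeniently(Iterable<String> lines, Function<String, T> parser) {
        Result<T> result = new Result<>();
        for (String line : lines) {
            try {
                result.parsed.add(parser.apply(line));
            } catch (IllegalArgumentException e) {
                result.rejected.add(line);
                System.err.println("Skipping problematic line: " + e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed toy "training set" of known taxon names.
        Set<String> knownTaxa = new HashSet<>(Arrays.asList("Fungi", "Basidiomycota"));

        // Stand-in parser: rejects a line whose second tab-separated field is unknown.
        Function<String, String> parser = line -> {
            String taxon = line.split("\t")[1];
            if (!knownTaxa.contains(taxon)) {
                throw new IllegalArgumentException("taxon Node " + taxon + " is not found");
            }
            return taxon;
        };

        List<String> lines = Arrays.asList(
                "read1\tFungi",
                "read2\tenvironmental samples",
                "read3\tBasidiomycota");

        Result<String> result = parseLeniently(lines, parser);
        System.out.println("parsed:   " + result.parsed);
        System.out.println("rejected: " + result.rejected);
    }
}
```

The rejected lines are returned rather than discarded, so a caller could write them to a separate file (for example with `java.nio.file.Files.write`) and continue merging the rest.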

rdpstaffmsu commented 9 years ago

We are sorry for the late reply; we have been having issues with GitHub notifications. Hopefully we can still help! Would you be able to provide us with the training data you used? That would be the easiest way for us to pinpoint the problem.