Closed josephhearnshaw closed 3 years ago
The problem is that Lucene doesn't like accessions such as "-" or other strings made only of punctuation marks (or spaces). I've added a better error message, which reports the field and the searched string. This way, the collapser still fails, but at least with more details. I could make it continue (ignoring the wrong accession), but I think failing is better, because accessions with such values almost always originate from some error and it is safer to fix them. If '-' is meant to say 'no accession', the entry shouldn't be there at all.
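To illustrate the kind of check described above, here is a minimal sketch, not the actual Ondex code. The class name `AccessionCheck`, its methods and the field name used in `main` are hypothetical, and "searchable" is assumed to mean "contains at least one letter or digit", which is only an approximation of what the Lucene analyzer accepts.

```java
// Hypothetical helper, not part of Ondex: rejects accessions that contain
// no letters or digits (e.g. "-", "  ", "..."), reporting field and value.
class AccessionCheck {

    /** Returns true if the accession has at least one letter or digit. */
    static boolean isSearchable(String accession) {
        if (accession == null) {
            return false;
        }
        return accession.codePoints().anyMatch(Character::isLetterOrDigit);
    }

    /** Fails fast with a message that names the field and the searched string. */
    static void requireSearchable(String field, String accession) {
        if (!isSearchable(accession)) {
            throw new IllegalArgumentException(String.format(
                "Cannot search field '%s' for accession '%s': "
                + "it contains no letters or digits", field, accession));
        }
    }

    public static void main(String[] args) {
        requireSearchable("exampleField", "AT1G01010"); // passes silently
        try {
            requireSearchable("exampleField", "-");
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Failing with an exception, rather than skipping the value, matches the choice explained above: the bad accession most likely signals an error in the data.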
(the following ones are internal notes for Knetminer developers)
The new code is under (...)/software/ondex-desktop and requires Java 11 (the master branch requires it now). It's not worth retrofitting this small change into 3.0, since the problem is mainly in the data.
I've defined a new launch script in (...)/test/mapping_bug/git_issue_29/launch.sbatch. @josephhearnshaw, have a look at it for tips on how to improve those scripts, e.g., the use of relative paths.
By the way, I don't think -Xmx set to the same value as #SBATCH --mem can work: the total memory of the submitted process needs room for the JVM itself as well as its heap (plus the bytecode and other areas I don't remember), i.e., the limit passed to the JVM has to be smaller, or you risk SLURM killing the process.
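As a rough sketch of the arithmetic, the heap limit can be derived from the SLURM allocation by reserving some headroom. The class `HeapBudget`, its method and the 20% headroom are hypothetical choices for illustration, not measured values; the right margin depends on the workload.

```java
// Hypothetical helper: derives a -Xmx value from the SLURM memory request,
// leaving part of the allocation for non-heap JVM memory.
class HeapBudget {

    /**
     * @param slurmMemMb      value given to "#SBATCH --mem", in megabytes
     * @param headroomPercent share of the allocation kept outside the heap
     * @return heap limit in megabytes, rounded down
     */
    static long heapLimitMb(long slurmMemMb, int headroomPercent) {
        if (slurmMemMb <= 0) {
            throw new IllegalArgumentException("slurmMemMb must be positive");
        }
        if (headroomPercent <= 0 || headroomPercent >= 100) {
            throw new IllegalArgumentException("headroomPercent must be in 1..99");
        }
        return slurmMemMb * (100 - headroomPercent) / 100;
    }

    public static void main(String[] args) {
        long slurmMemMb = 32_768; // assumed example allocation of 32 GB
        System.out.println("-Xmx" + heapLimitMb(slurmMemMb, 20) + "m");
    }
}
```

The point is only that the value passed to -Xmx stays strictly below the SLURM limit.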
As a general consideration, the old Lucene code in Ondex is a pain. I'm tempted to rewrite plug-ins like the mapper on top of simpler hash maps (the mapper uses Lucene just to search accessions by identity). I'll see if these errors keep happening.
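A hash-map-based lookup of that kind could look roughly like the following. This is only a sketch of the idea, not a proposed Ondex API: `AccessionIndex`, the use of integer concept IDs and the (data source, accession) key are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical exact-match index: maps (data source, accession) to concept IDs.
class AccessionIndex {

    private final Map<String, Set<Integer>> conceptsByKey = new HashMap<>();

    // The NUL separator keeps ("a", "bc") and ("ab", "c") from colliding.
    private static String key(String dataSource, String accession) {
        return dataSource + '\u0000' + accession;
    }

    /** Registers a concept under the given accession. */
    void add(String dataSource, String accession, int conceptId) {
        conceptsByKey
            .computeIfAbsent(key(dataSource, accession), k -> new LinkedHashSet<>())
            .add(conceptId);
    }

    /** Returns the concepts sharing exactly this accession, or an empty set. */
    Set<Integer> find(String dataSource, String accession) {
        return conceptsByKey.getOrDefault(
            key(dataSource, accession), Collections.emptySet());
    }

    public static void main(String[] args) {
        AccessionIndex index = new AccessionIndex();
        index.add("TAIR", "AT1G01010", 1);
        index.add("TAIR", "AT1G01010", 2);
        System.out.println(index.find("TAIR", "AT1G01010"));
    }
}
```

Since the lookup is by plain string equality, no query parsing is involved, so a value like "-" would not raise an error by itself. Whether such accessions should be accepted is a separate question.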
The following error occurs:
This is with the Ondex Mini 3.0 release and a certain OXL file as input.
The test workflow and data are in the knetminer share under test/mapping_bug/git_issue_29/