termsuite / termsuite-core

A Java UIMA-based toolbox for multilingual and efficient terminology extraction and multilingual term alignment
Apache License 2.0

Debug command-line path portability on Windows #100

Closed by dcram 7 years ago

dcram commented 7 years ago

I am trying to run TermSuite on a Windows PC. The GUI version runs fine on a 64-bit Windows OS but not on a 32-bit one.

The CLI version always fails with an error.

I call the extraction with:

java -Xms256m -Xmx8g -cp termsuite-core-3.0.3.jar fr.univnantes.termsuite.tools.TerminologyExtractorCLI \
  -t tree-tagger \
  -c wind-energy/English/txt/ \
  -l en \
  --tsv wind-energy-en.tsv

And I get this message:

i:\TermSuite\workspace>java -Xms256m -Xmx8g -cp termsuite-core-3.0.3.jar fr.univnantes.termsuite.tools.TerminologyExtractorCLI -t tree-tagger -c wind-energy/English/txt/ -l en --tsv wind-energy-en.tsv
Exception in thread "main" fr.univnantes.termsuite.tools.TermSuiteCliException: An unexpected error occurred: Illegal char <:> at index 2: /i:/TermSuite/workspace/wind-energy/English/txt/file-10.txt
        at fr.univnantes.termsuite.tools.CommandLineClient.launch(CommandLineClient.java:295)
        at fr.univnantes.termsuite.tools.TerminologyExtractorCLI.main(TerminologyExtractorCLI.java:196)
Caused by: java.nio.file.InvalidPathException: Illegal char <:> at index 2: /i:/TermSuite/workspace/wind-energy/English/txt/file-10.txt
        at sun.nio.fs.WindowsPathParser.normalize(Unknown Source)
        at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
        at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
        at sun.nio.fs.WindowsPath.parse(Unknown Source)
        at sun.nio.fs.WindowsFileSystem.getPath(Unknown Source)
        at java.nio.file.Paths.get(Unknown Source)
        at fr.univnantes.termsuite.model.FileSystemCorpus.readFileContent(FileSystemCorpus.java:103)
        at fr.univnantes.termsuite.api.TXTCorpus.readDocumentText(TXTCorpus.java:30)
        at fr.univnantes.termsuite.api.Preprocessor.lambda$toIndexedCorpus$0(Preprocessor.java:136)
        at java.util.stream.ReferencePipeline$3$1.accept(Unknown Source)
        at java.util.stream.ReferencePipeline$3$1.accept(Unknown Source)
        at java.util.stream.ReferencePipeline$2$1.accept(Unknown Source)
        at java.util.stream.ReferencePipeline$3$1.accept(Unknown Source)
        at java.util.Iterator.forEachRemaining(Unknown Source)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Unknown Source)
        at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(Unknown Source)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(Unknown Source)
        at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
        at java.util.stream.ReferencePipeline.forEach(Unknown Source)
        at fr.univnantes.termsuite.api.Preprocessor.toIndexedCorpus(Preprocessor.java:211)
        at fr.univnantes.termsuite.api.Preprocessor.toIndexedCorpus(Preprocessor.java:133)
        at fr.univnantes.termsuite.api.Preprocessor.toIndexedCorpus(Preprocessor.java:127)
        at fr.univnantes.termsuite.tools.TerminologyExtractorCLI.getIndexedCorpus(TerminologyExtractorCLI.java:185)
        at fr.univnantes.termsuite.tools.TerminologyExtractorCLI.run(TerminologyExtractorCLI.java:132)
        at fr.univnantes.termsuite.tools.CommandLineClient.launch(CommandLineClient.java:287)
        ... 1 more
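
The trace shows `Paths.get` rejecting the string `/i:/TermSuite/...`, that is, a Windows drive path with a stray leading slash. One plausible cause (an assumption, since the TermSuite source is not shown in this thread) is that the path string is taken from `URI.getPath()` or `URL.getPath()`, which returns exactly this form for a `file:` URI. The sketch below is a minimal illustration of that behaviour and of one possible workaround; the class `WindowsPathFix` and the helper `stripLeadingSlashBeforeDrive` are hypothetical names, not part of TermSuite.

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of TermSuite.
final class WindowsPathFix {

    // Matches a leading slash that is directly followed by a drive letter,
    // a colon, and a path separator, e.g. "/i:/" or "/C:\".
    private static final Pattern LEADING_SLASH_BEFORE_DRIVE =
            Pattern.compile("^/([A-Za-z]:[/\\\\])");

    // Removes the stray leading slash; any other path is returned unchanged.
    static String stripLeadingSlashBeforeDrive(String rawPath) {
        return LEADING_SLASH_BEFORE_DRIVE.matcher(rawPath).replaceFirst("$1");
    }

    public static void main(String[] args) {
        // Assumption: the corpus file is referenced through a file: URI.
        URI uri = URI.create("file:///i:/TermSuite/workspace/file-10.txt");

        // URI.getPath() keeps the slash in front of the drive letter.
        String raw = uri.getPath();
        System.out.println(raw);

        // The cleaned string no longer starts with "/i:".
        System.out.println(stripLeadingSlashBeforeDrive(raw));

        // POSIX-style paths are left untouched.
        System.out.println(stripLeadingSlashBeforeDrive("/home/user/file-10.txt"));
    }
}
```

If the path really does originate from a `URI`, passing the `URI` object itself to `Paths.get(URI)` rather than its string form is probably the cleaner fix, because the file system provider then performs the conversion for the current platform.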

jsxs0 commented 4 years ago

Any updates, anyone?