LHNCBC / metamaplite

A near real-time named-entity recognizer
https://metamap.nlm.nih.gov/MetaMapLite.shtml

`getBasename` selects higher-level directory if it contains dots/periods #35

Open dcronkite opened 1 year ago

dcronkite commented 1 year ago

When processing files without extensions, `getBasename` truncates the path at a period in a parent directory name, if one of the directories contains a period:

https://github.com/lhncbc/metamaplite/blob/a7d10264a023afda497356f50faa4385ab7e3908/src/main/java/gov/nih/nlm/nls/ner/MetaMapLite.java#L1003-L1011

E.g., if I supply the path /mapr/r.ds/mml/file_0, it will attempt to write the output json file to /mapr/r.json rather than /mapr/r.ds/mml/file_0.json:

[main] INFO gov.nih.nlm.nls.ner.MetaMapLite - Loading and processing /mapr/r.ds/mml/file_0
[main] INFO gov.nih.nlm.nls.ner.MetaMapLite - outputing results to /mapr/r.json
Exception in thread "main" java.io.FileNotFoundException: /mapr/r.json (Operation not permitted)

Workaround

I've worked around this by requiring all input files to have a .txt extension (e.g., /mapr/r.ds/mml/file_0.txt in the above example). The output is then written to the correct directory (e.g., /mapr/r.ds/mml/file_0.json).
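
Possible fix (sketch)

The behavior in the log above is consistent with stripping everything after the last period anywhere in the path. The following is a minimal sketch of one way to avoid that, not the project's actual code: `naiveBasename` is a hypothetical reconstruction of the reported behavior, and `safeBasename` is a hypothetical replacement that only treats a period as an extension separator when it occurs in the final path component. The class and method names are assumptions made for illustration.

```java
import java.io.File;

class BasenameSketch {

    // Hypothetical reconstruction of the reported behavior (an assumption,
    // not copied from MetaMapLite): cut at the last '.' anywhere in the path.
    static String naiveBasename(String path) {
        int dot = path.lastIndexOf('.');
        return dot >= 0 ? path.substring(0, dot) : path;
    }

    // Hypothetical fix: only strip an extension when the last '.' lies inside
    // the final path component and is not that component's first character
    // (so dot-files such as ".bashrc" are left untouched).
    static String safeBasename(String path) {
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf(File.separatorChar));
        int dot = path.lastIndexOf('.');
        return dot > sep + 1 ? path.substring(0, dot) : path;
    }

    public static void main(String[] args) {
        String noExt = "/mapr/r.ds/mml/file_0";
        String withExt = "/mapr/r.ds/mml/file_0.txt";

        // Reproduces the wrong output location from the report.
        System.out.println(naiveBasename(noExt) + ".json");   // /mapr/r.json

        // Both inputs resolve to the same, correct output location.
        System.out.println(safeBasename(noExt) + ".json");    // /mapr/r.ds/mml/file_0.json
        System.out.println(safeBasename(withExt) + ".json");  // /mapr/r.ds/mml/file_0.json
    }
}
```

With this approach, input files without an extension would not need to be renamed, and files that already have an extension keep their current output path.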