Closed wkiri closed 3 years ago
@stevenlujpl I went hunting and found another `lpsc_parser.py` process running on `mlia-compute` with my username (!). I can't find where it was running from (it was not in any active screens), but I terminated the process. I suspect this was causing the failures above. I will re-run and let you know if I see any further issues.
@stevenlujpl This run completed with no errors. I suspect that the other process caused the errors above, so please do not worry about spending time on them. I think that the `open()` call must be creating an (empty) input file for jSRE, so we do not see an error even if `records` is empty. It would be nice to skip the call to jSRE in that case, but it is not a critical change at this time. If you agree, feel free to close this issue. However, if it is an easy addition to skip jSRE for empty `records`, I think it is a nice update and might shorten runtime a bit.
I confirmed that the `io.open()` call will create an empty file when the `records` variable is empty. I also added a check to skip the jSRE call when there are no target-element and no target-mineral records.
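For anyone reading this later, here is a minimal sketch of the behaviour described above. It is not the actual `lpsc_parser.py` code: `write_records`, `run_jsre_if_needed`, and the `run_jsre` callable are hypothetical names used only for illustration, and I am assuming the records are plain strings written one per line.

```python
import io
import os
import tempfile


def write_records(records, path):
    """Write one record per line (assumed format) and return the count.

    Opening the file in "w" mode creates it even when `records` is empty,
    which is why a downstream tool would not report a missing input file.
    """
    with io.open(path, "w", encoding="utf8") as f:
        for rec in records:
            f.write(rec + u"\n")
    return len(records)


def run_jsre_if_needed(records, path, run_jsre):
    """Call the hypothetical `run_jsre(path)` only when records exist."""
    if not records:
        # Nothing to classify: skip both the file write and the jSRE call.
        return None
    write_records(records, path)
    return run_jsre(path)


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    empty_path = os.path.join(tmp_dir, "empty_input.txt")
    write_records([], empty_path)
    # The file exists but has zero bytes.
    print(os.path.exists(empty_path), os.path.getsize(empty_path))
```

The guard sits before the file is opened, so an empty `records` list leaves no zero-byte input file behind and avoids starting jSRE at all.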
Excellent!
This run caused 17 jSRE "no input file" errors when run on the full set of 1303 MER-A documents:
[2021-09-01 11:00:07]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2004_2167.pdf
[2021-09-01 11:00:25]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2004_2172.pdf
[2021-09-01 11:01:09]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2004_2184.pdf
[2021-09-01 11:01:38]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2004_2189.pdf
[2021-09-01 11:03:36]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2005_1244.pdf
[2021-09-01 11:04:15]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2005_1337.pdf
[2021-09-01 11:05:11]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2005_1455.pdf
[2021-09-01 11:14:08]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2009_1978.pdf
[2021-09-01 11:16:23]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2010_2013.pdf
[2021-09-01 11:23:02]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2013_1265.pdf
[2021-09-01 11:23:51]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2013_1674.pdf
[2021-09-01 11:26:04]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2014_1518.pdf
[2021-09-01 11:26:13]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2014_1590.pdf
[2021-09-01 11:33:18]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2018_1895.pdf
[2021-09-01 11:33:31]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2018_2286.pdf
[2021-09-01 11:34:57]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2019_2322.pdf
[2021-09-01 11:36:53]: LPSC parser failed: /proj/mte/data/corpus-lpsc/mer-pdf/2020_2783.pdf