Closed by jhpoelen 4 years ago
Appears to be fixed after upgrade to Nomer v0.1.11 - the following command completed without errors:
$ zcat target/taxonCacheNoHeader.tsv.gz | tail -n +2 | cut -f3 | awk -F '\t' '{ print $1 "\t" $1 }' | java -jar target/nomer.jar replace --properties=config/name2id.properties globi-taxon-rank | cut -f1 | java -jar target/nomer.jar replace --properties=config/id2name.properties globi-taxon-rank > target/norm_ranks.tsv
using matcher [globi-taxon-rank]
using matcher [globi-taxon-rank]
Invalid cookie header: "Set-Cookie: WMF-Last-Access=20-May-2020;Path=/;HttpOnly;secure;Expires=Sun, 21 Jun 2020 12:00:00 GMT". Invalid 'expires' attribute: Sun, 21 Jun 2020 12:00:00 GMT
Invalid cookie header: "Set-Cookie: WMF-Last-Access-Global=20-May-2020;Path=/;Domain=.wikidata.org;HttpOnly;secure;Expires=Sun, 21 Jun 2020 12:00:00 GMT". Invalid 'expires' attribute: Sun, 21 Jun 2020 12:00:00 GMT
Invalid cookie header: "Set-Cookie: WMF-Last-Access=20-May-2020;Path=/;HttpOnly;secure;Expires=Sun, 21 Jun 2020 12:00:00 GMT". Invalid 'expires' attribute: Sun, 21 Jun 2020 12:00:00 GMT
Invalid cookie header: "Set-Cookie: WMF-Last-Access-Global=20-May-2020;Path=/;Domain=.wikidata.org;HttpOnly;secure;Expires=Sun, 21 Jun 2020 12:00:00 GMT". Invalid 'expires' attribute: Sun, 21 Jun 2020 12:00:00 GMT
Nomer uses internal indexes/caches (in the .nomer directory) to enable fast offline term matching. When two Nomer instances were run on a system without an existing index, the following exception was observed:
java.io.IOError: java.io.IOException: Wrong index checksum, store was not closed properly and could be corrupted.
The root cause was that a new index, created by the first Nomer instance and still holding uncommitted changes, was being reused by the second Nomer instance.
The suggested fix is to explicitly commit changes after creating a new index and to prevent overwriting of existing (partial) indexes.
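
The general shape of that fix can be sketched as follows. This is not Nomer's actual code: it uses a plain file in place of the real index store, and the class name IndexBuilder, the method buildIfAbsent, and the file name index.db are all hypothetical. The idea is to build the index in a temporary file, flush it to disk (the "commit"), and only then move it to its final location, refusing to replace an index that is already there.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; a plain file stands in for the real index store.
class IndexBuilder {

    // Creates the index at `target` unless one already exists.
    // Returns true if this call created it, false if an existing index was kept.
    static boolean buildIfAbsent(Path target, String content) throws IOException {
        if (Files.exists(target)) {
            return false; // never overwrite an existing index
        }
        // Build in the same directory so the final move stays on one filesystem.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "index", ".tmp");
        try {
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buffer = ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8));
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                channel.force(true); // explicit "commit": flush data to disk before publishing
            }
            try {
                // Without REPLACE_EXISTING, move fails if the target already exists.
                // Note: this check is not guaranteed to be atomic across processes;
                // a strict implementation would also need a lock.
                Files.move(tmp, target);
                return true;
            } catch (FileAlreadyExistsException e) {
                return false; // another instance published its index first
            }
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nomer-sketch");
        Path target = dir.resolve("index.db");
        System.out.println(buildIfAbsent(target, "first"));
        System.out.println(buildIfAbsent(target, "second"));
    }
}
```

With this arrangement a second instance either sees no index at all or a fully written one, never a half-written store, which is the situation that appeared to trigger the checksum error above.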