
CM-Well - a data warehouse for your knowledge graph
Apache License 2.0

dcc doesn't detect infoton with no record in infoton table #759

Open Ivan-Shestakov opened 6 years ago

Ivan-Shestakov commented 6 years ago

To reproduce:

1) Ingest an infoton:

curl localhost:9000/_in?format=ntriples -d '<http://test/Infoton-deleted-from-infoton-table> <http://qa.test.tr.com/v1.0/ns#TestName> "Delete from infoton table" .'

2) In the cqlsh tool, look up the infoton's uuid in the path table and delete its rows from the infoton table:

~/app/cm-well/app/cas/cur/bin/cqlsh
use data2;
select * from path WHERE path = '/test/Infoton-deleted-from-infoton-table';
DELETE FROM infoton WHERE uuid = 'd33f9d1caeea9f37682d26b63b73ce09';

3) Reset crawler offsets:

echo $XTOKEN
"X-CM-WELL-TOKEN:eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJyb290IiwiZXhwIjoxNTI5NDA3MTgwODE1LCJyZXYiOjB9.hUJ_v1ULy7i7CY-kUvtMdh6v5N-5ITjvdTKJ8oXECNs"
curl localhost:9000/zz/crawlerPosition.0_offset?op=purge -H $XTOKEN
{"success":true}
curl localhost:9000/zz/crawlerPosition.0.p_offset?op=purge -H $XTOKEN
{"success":true}

4) Restart bg and observe the crawler log.
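For environments with more than one partition, the offset reset in step 3 could be scripted. The following is only a sketch: it assumes that other partitions use the same key naming as partition 0 above (`crawlerPosition.<N>_offset` and `crawlerPosition.<N>.p_offset`), and the function names `purge_urls` and `purge_partition` are hypothetical helpers, not part of CM-Well.

```python
import json
import urllib.request


def purge_urls(partition: int, host: str = "localhost:9000") -> list[str]:
    """Build the two purge URLs from step 3 for one crawler partition.

    Assumption: every partition follows the key naming seen for partition 0.
    """
    keys = [
        f"crawlerPosition.{partition}_offset",
        f"crawlerPosition.{partition}.p_offset",
    ]
    return [f"http://{host}/zz/{key}?op=purge" for key in keys]


def purge_partition(partition: int, token: str, host: str = "localhost:9000") -> list[dict]:
    """Send both purge requests and return the parsed JSON replies.

    `token` is the bare token value; it is sent in the X-CM-WELL-TOKEN header,
    as in the curl commands above.
    """
    replies = []
    for url in purge_urls(partition, host):
        request = urllib.request.Request(url, headers={"X-CM-WELL-TOKEN": token})
        with urllib.request.urlopen(request) as response:
            replies.append(json.loads(response.read().decode("utf-8")))
    return replies
```

Calling `purge_partition(0, token)` against a running node should correspond to the two curl calls in step 3; each reply is expected to carry the `success` field shown above.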

Expected result:

An error message in the crawler log about the data missing from the infoton table.

Actual result:

No error is logged.

Ivan-Shestakov commented 6 years ago

Note: I failed to reproduce this on a PE environment, but it does happen on a multinode environment. Later edit: I compared imp.X_offset with crawlerPosition.X_offset and saw that I hadn't given the crawler enough time to finish processing all of its pending work. I will update here when it finishes.
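The check described in the later edit can be expressed as a small comparison. This is a sketch under two assumptions: that both offsets are monotonically increasing integers, and that the crawler has finished a partition once crawlerPosition.X_offset reaches imp.X_offset. How the two values are read from the cluster is left out, and the function names are hypothetical.

```python
def pending_work(imp_offset: int, crawler_offset: int) -> int:
    """Number of offsets the crawler still has to process for one partition.

    Assumption: the crawler trails imp, so the gap is never negative.
    """
    return max(imp_offset - crawler_offset, 0)


def crawler_caught_up(imp_offset: int, crawler_offset: int) -> bool:
    """True once the crawler position has reached the imp offset."""
    return pending_work(imp_offset, crawler_offset) == 0
```

Under those assumptions, the crawler log is only worth inspecting for the missing-infoton error after `crawler_caught_up` returns True for the partition that holds the test infoton.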