Closed: kasparas12 closed this 3 years ago
On 26 Apr 2021, at 08:34, Kasparas Taminskas @.***> wrote:
Hello, I am playing with the crawler and starting the crawl by issuing this command:
Everything seems to run fine, and the crawl starts as a background process. Since I am running crawls on AWS Spot instances, they may be reclaimed at any time. The volume data is preserved, but the machine is swapped for another one, so the crawl is interrupted. I tried to simulate this by simply killing the Java process and reissuing the command above without the -n option to continue the crawl. The crawl logs then show these errors:
2021-04-26 07:25:21,631 1588 ERROR [main] i.u.d.l.b.f.Frontier - Trying to restore state from snap directory crawl-digital/frontier/snap, but it does not exist or is not a directory
2021-04-26 07:25:21,632 1589 ERROR [Distributor] i.u.d.l.b.f.Distributor - Unexpected exception
java.lang.NullPointerException: null
    at it.unimi.di.law.bubing.frontier.Distributor.run(Distributor.java:134)
2021-04-26 07:25:21,775 1732 ERROR [MessageThread] i.u.d.l.b.f.MessageThread - Unexpected exception
java.lang.NullPointerException: null
    at it.unimi.di.law.bubing.frontier.MessageThread.run(MessageThread.java:54)
So I guess the snap directory is somehow not being created. Does anyone know why this might happen? It prevents me from continuing the crawl.
You have to stop the crawler cleanly to get a snapshot of the current state. If you kill the process, no snapshot is written, so you cannot restart the crawl.
Ciao,
seba
Thank you.
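
For the Spot instance case, one possible approach is to watch for the interruption notice and trigger a clean stop before the machine is reclaimed, so that a snapshot exists when the crawl is restarted. The sketch below is only an illustration under stated assumptions: it polls the AWS instance metadata path `spot/instance-action`, which returns 404 until an interruption is scheduled, and it assumes the metadata service answers without a session token (with IMDSv2 enforced, a token header would be needed). `stop_crawler` is a hypothetical callback; how to stop the crawler cleanly is not shown in this thread, so that part is left to the caller.

```python
import json
import urllib.error
import urllib.request

# AWS instance metadata path for Spot interruption notices (404 until one is scheduled).
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def fetch_spot_action(url=SPOT_ACTION_URL, timeout=2.0):
    """Return the notice body as text, or None if no interruption is scheduled."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def interruption_scheduled(body):
    """True if the body is a JSON notice announcing terminate, stop, or hibernate."""
    if body is None:
        return False
    try:
        notice = json.loads(body)
    except ValueError:
        return False
    if not isinstance(notice, dict):
        return False
    return notice.get("action") in {"terminate", "stop", "hibernate"}


def watch(fetch, stop_crawler, sleep, interval=5.0, max_polls=None):
    """Poll until a notice appears, then call stop_crawler once.

    fetch, stop_crawler, and sleep are injected so the loop can be tested
    without network access. stop_crawler is a hypothetical hook: it should
    perform whatever clean stop the crawler supports.
    Returns True if a stop was triggered, False if max_polls was exhausted.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if interruption_scheduled(fetch()):
            stop_crawler()
            return True
        sleep(interval)
        polls += 1
    return False
```

On a real instance this could be run alongside the crawl as `watch(fetch_spot_action, my_clean_stop, time.sleep)`, where `my_clean_stop` is whatever clean-stop procedure is used. The interruption notice only gives a short warning, so the stop would have to complete within that window.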