Closed ahmadia closed 9 years ago
Closing this issue as it's now resolved.
Aron and Brittain, thank you both for the detailed traces and responses. This is really interesting, and I hope there is something we can do down in Nutch to mitigate this. I'm going to reference this thread over on user@nutch so that people have visibility. Lewis
On Tuesday, June 2, 2015, Aron Ahmadia notifications@github.com wrote:
Closing this issue as it's now resolved.
By default, Vagrant maps the "source" directory on the host machine to `/vagrant` on the client. This is handy, particularly when you want to make local source changes and see how they affect the deployed machine. It can break, however, when the program is running in the local source directory, or when operations on the source directory are sensitive to the file system type.
This is the sort of error Brittain noticed last week. Sample output:
For now, in Memex Explorer, we're fixing this by running the crawls in `/home/vagrant`, which is not mapped; see https://github.com/memex-explorer/memex-explorer/pull/557

The issue is fixed for us, but it has exposed an underlying issue in the way Nutch interacts with a "hostile" file system, and the Nutch developers might want to take a look at this to harden the crawler against similar issues in the future.
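For anyone who wants to guard against this before launching a crawl, here is a minimal sketch of one way to check whether a directory sits on a shared-folder file system. It is not what the PR above does. The function names and the list of file system types are my own assumptions (`vboxsf` is what VirtualBox shared folders show up as; the others are guesses at what other providers might use), and it relies on the Linux `/proc/mounts` format.

```python
import os

# Assumed list of file system types used for VM shared folders.
# Not exhaustive; adjust for your provider.
SHARED_FS_TYPES = {"vboxsf", "vmhgfs", "9p", "nfs", "nfs4"}


def filesystem_type(path, mounts_text):
    """Return the file system type of the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts (Linux only). Returns None
    if no mount point matches.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # A mount point contains the path if it equals the path
        # or is one of its parent directories.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest (most specific) mount point; on a tie,
            # the later entry wins because it shadows the earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_shared_folder(path, mounts_text):
    """True if `path` lives on one of the assumed shared-folder file systems."""
    return filesystem_type(path, mounts_text) in SHARED_FS_TYPES
```

On the guest you would pass in the real mount table, for example `is_shared_folder(crawl_dir, open("/proc/mounts").read())`, where `crawl_dir` is a hypothetical variable holding the crawl directory, and refuse to start or warn if it returns `True`.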
cc @lewismc @brittainhard @chrismattmann