Norconex / jef

Job Execution Framework.
Apache License 2.0

java.io.IOException on .index file. How to avoid? #14

Closed LeMoussel closed 4 years ago

LeMoussel commented 4 years ago

Hi!

Under Windows, I often get a java.io.IOException on the .index file.

ERROR - FileJobStatusStore - Cannot read file: D:\Developpement\Java\Norconex HTTP Collector\2.9.0.\output\progress\latest\status\www.example.com__www.example.com.job
ERROR - JEFMonInstance - Cannot sync suite statuses for index file: D:\Developpement\Java\Norconex HTTP Collector\2.9.0\output\progress\latest\www.example.com.index
java.io.IOException: Le processus ne peut pas accèder au fichier car un autre processus en a verrouillée une partie
    at java.io.RandomAccessFile.read0(Native Method)
    at java.io.RandomAccessFile.read(Unknown Source)
    at java.io.RandomAccessFile.readUnsignedShort(Unknown Source)
    at java.io.DataInputStream.readUTF(Unknown Source)
    at java.io.RandomAccessFile.readUTF(Unknown Source)
    at com.norconex.jef4.status.FileJobStatusStore.read(FileJobStatusStore.java:215)
    at com.norconex.jef4.status.JobSuiteStatusSnapshot.loadTreeNode(JobSuiteStatusSnapshot.java:206)
    at com.norconex.jef4.status.JobSuiteStatusSnapshot.newSnapshot(JobSuiteStatusSnapshot.java:193)
    at com.norconex.jefmon.instance.JEFMonInstance$Monitor.syncIndexFiles(JEFMonInstance.java:143)
    at com.norconex.jefmon.instance.JEFMonInstance$Monitor.run(JEFMonInstance.java:112)
    at java.lang.Thread.run(Unknown Source)

(The French message translates to: "The process cannot access the file because another process has locked a portion of the file.")

How can I avoid these exceptions being displayed? (Note: I use jef-monitor.bat.)

essiembre commented 4 years ago

Hard to say, as I cannot reproduce it. Try Microsoft Process Explorer to see which other process holds a handle on the file when that happens. It may be your antivirus, Microsoft Search Indexer, etc. It could also be that the crawler and JEF Monitor are both competing for the same file, but that's normally not an issue.
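Process Explorer is what identifies the process holding the handle. As a complementary check, a small Java probe can confirm whether some other process holds a lock on the file at a given moment. This is only a minimal sketch: `LockProbe` and `isLockedElsewhere` are hypothetical names, not part of JEF or JEF Monitor, and the probe cannot tell you which process owns the lock.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of JEF: reports whether another
// process currently holds a conflicting lock on a file.
final class LockProbe {

    // Tries to take a shared lock on the whole file. tryLock returns
    // null when another program holds an overlapping lock.
    // Assumption: the file can at least be opened for reading; if it
    // cannot, the IOException from FileChannel.open is propagated.
    static boolean isLockedElsewhere(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            FileLock lock = channel.tryLock(0L, Long.MAX_VALUE, true);
            if (lock == null) {
                return true;
            }
            lock.release();
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java LockProbe <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + " locked elsewhere: " + isLockedElsewhere(file));
    }
}
```

Running it against the .index file while the error is occurring would show whether a lock is present, which narrows down when to look in Process Explorer.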

If you just can't find out, a (super ugly) workaround would be to use a Scheduled Task to periodically copy your progress files somewhere else, to a location that only JEF Monitor accesses, and see if that resolves your issue.
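The copy step of that workaround could be sketched in Java roughly as follows. This is an illustration under assumptions, not a JEF feature: `ProgressMirror` and `mirror` are hypothetical names, and the source and target directories are whatever you pass on the command line. Files that cannot be read at copy time (for example because they are locked) are skipped and picked up on the next run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of JEF: mirrors a progress directory
// to a second location that only the monitor reads.
final class ProgressMirror {

    // Copies every regular file under source into target, preserving
    // the relative layout. Returns the number of files copied.
    static int mirror(Path source, Path target) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(source)) {
            entries = walk.collect(Collectors.toList());
        }
        int copied = 0;
        for (Path entry : entries) {
            Path dest = target.resolve(source.relativize(entry).toString());
            if (Files.isDirectory(entry)) {
                Files.createDirectories(dest);
                continue;
            }
            try {
                Path parent = dest.getParent();
                if (parent != null) {
                    Files.createDirectories(parent);
                }
                Files.copy(entry, dest, StandardCopyOption.REPLACE_EXISTING);
                copied++;
            } catch (IOException e) {
                // File was locked or removed meanwhile: skip it, the
                // next scheduled run will try again.
                System.err.println("Skipped " + entry + ": " + e.getMessage());
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java ProgressMirror <sourceDir> <targetDir>");
            return;
        }
        int copied = mirror(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + copied + " file(s)");
    }
}
```

A Scheduled Task would run this once per interval, with JEF Monitor reading the mirrored directory instead of the crawler's live progress directory.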

LeMoussel commented 4 years ago

Hi!

It would be useful to have a tag in the jefconfig.xml file that configures periodic copying of the progress files to a directory that only JEF Monitor accesses.

What do you think?

essiembre commented 4 years ago

Sure, I marked it as a feature request. Did it work for you though?

LeMoussel commented 4 years ago

As you suggested, I do it with a Scheduled Task: the progress files are copied periodically. Unless I'm mistaken, I couldn't find any documentation on the jefconfig.xml file.

essiembre commented 4 years ago

No, it is not a feature of JEF, but rather a Windows workaround. I am glad it solves your problem for now.

LeMoussel commented 4 years ago

I misspoke. I was asking whether there is documentation that explains the configuration tags in the jefconfig.xml file.

essiembre commented 4 years ago

I am afraid not. Only setup.properties is meant to be changed manually, if the default values do not suit you (for example, to change the default port).

The other config is managed by the application itself.

LeMoussel commented 4 years ago

Ok. Thanks.