Closed hardreddata closed 4 years ago
The issue seems related to your starting path. If your drive is already mapped for the account running the crawler and you are not concerned with document ACLs, you should not need to add CIFS support.
If you want to extract the ACLs and crawl it using the SMB/CIFS protocol, you likely need to specify your start path like this:
smb://hostname/pathToMappedDir/RussellG/crawlme
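In the Filesystem Collector config, that start path would sit under startPaths. A minimal fragment, assuming the 2.x startPaths/path elements (hostname and pathToMappedDir are placeholders to replace with your real UNC pieces):
<!-- placeholders: replace hostname and pathToMappedDir with the real values -->
<startPaths>
  <path>smb://hostname/pathToMappedDir/RussellG/crawlme</path>
</startPaths>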
Thanks for your help.
The smb paths result in the error below.
INFO [AbstractCrawler] Sample Crawler: Crawling references...
INFO [AbstractCrawler] Sample Crawler: 10% completed (1 processed/10 total)
ERROR [SpecificSmbFetcher] Could not retreive SMB ACL data.
jcifs.smb.SmbException: The handle is invalid.
at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
at jcifs.smb.SmbTransport.send(SmbTransport.java:663)
at jcifs.smb.SmbSession.send(SmbSession.java:238)
at jcifs.smb.SmbTree.send(SmbTree.java:119)
at jcifs.smb.SmbFile.send(SmbFile.java:775)
at jcifs.smb.SmbFile.close(SmbFile.java:1023)
at jcifs.smb.SmbFile.getSecurity(SmbFile.java:2904)
at jcifs.smb.SmbFile.getSecurity(SmbFile.java:2975)
at com.norconex.collector.fs.fetch.impl.SpecificSmbFetcher.fetchFileSpecificMeta(SpecificSmbFetcher.java:69)
at com.norconex.collector.fs.fetch.impl.GenericFileMetadataFetcher.fetchMetadada(GenericFileMetadataFetcher.java:75)
at com.norconex.collector.fs.pipeline.importer.FileImporterPipeline$FileMetadataFetcherStage.executeStage(FileImporterPipeline.java:153)
at com.norconex.collector.fs.pipeline.importer.AbstractImporterStage.execute(AbstractImporterStage.java:31)
at com.norconex.collector.fs.pipeline.importer.AbstractImporterStage.execute(AbstractImporterStage.java:24)
at com.norconex.commons.lang.pipeline.Pipeline.execute(Pipeline.java:91)
at com.norconex.collector.fs.crawler.FilesystemCrawler.executeImporterPipeline(FilesystemCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextQueuedCrawlData(AbstractCrawler.java:538)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextReference(AbstractCrawler.java:419)
at com.norconex.collector.core.crawler.AbstractCrawler$ProcessReferencesRunnable.run(AbstractCrawler.java:829)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
The only ACL information I am really interested in is the owner, and I can live without it. Another way forward for me would be if the ACL extraction could be disabled via config.
Any advice is very welcome.
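For what it's worth, the call that fails above (SmbFile.getSecurity) can be exercised outside the collector with jcifs 1.3.17 directly. A minimal sketch, where the hostname, share, and credentials are placeholders:

import jcifs.smb.ACE;
import jcifs.smb.NtlmPasswordAuthentication;
import jcifs.smb.SmbFile;

public class SmbAclCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials -- use the account the crawler runs as.
        NtlmPasswordAuthentication auth =
                new NtlmPasswordAuthentication("MYDOMAIN", "myuser", "mypassword");

        // Placeholder host/share, same smb:// style as the start path above.
        SmbFile file = new SmbFile("smb://hostname/share/RussellG/crawlme/", auth);

        // getSecurity(true) resolves SIDs to account names; this is the
        // SmbFile.getSecurity call shown in the stack trace above.
        ACE[] acl = file.getSecurity(true);
        for (ACE ace : acl) {
            System.out.println(ace);
        }
    }
}

If this stand-alone call fails with the same "handle is invalid" error, the problem likely sits with jCIFS or the share permissions rather than the collector configuration.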
It turns out the first problem you got is due to not being able to extract the ACL when the Windows drive is different from the one the crawler is running on (e.g., C: vs. S:). I just made a new 2.9.1-SNAPSHOT Filesystem Collector release with a fix for this.
Please give it a try and confirm.
Thanks for the prompt responses.
The fix worked great for path = s:/RussellG/crawlme.
As you suggested, I did not need to add CIFS support.
Hi,
Thanks for making this tool available. It is very useful. I have it working when scanning local drives.
I am running the latest 2.9 snapshot and have patched the CIFS .jar per https://github.com/Norconex/collector-filesystem/issues/49
Note that I had to get it from https://mvnrepository.com/artifact/jcifs/jcifs/1.3.17, as the link on the Norconex website no longer works (http://central.maven.org/maven2/jcifs/jcifs/1.3.17/jcifs-1.3.17.jar).
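For anyone else looking for it, the stock 1.3.17 jar is also published to Maven Central under the jcifs:jcifs coordinates, so it can be pulled in as a regular dependency:

<dependency>
  <groupId>jcifs</groupId>
  <artifactId>jcifs</artifactId>
  <version>1.3.17</version>
</dependency>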
I did explore patching commons-vfs per https://github.com/Norconex/collector-filesystem/issues/3, but I think this is no longer required?
The config (where domain and password1 are replaced):
With variables
Noting that s:/ is a mapped network drive. Throws:
Any advice is very welcome.