dadoonet / fscrawler

Elasticsearch File System Crawler (FS Crawler)
https://fscrawler.readthedocs.io/
Apache License 2.0

Auto detect file when it's moved into watched directory #1230

Open helsonxiao opened 3 years ago

helsonxiao commented 3 years ago

Is your feature request related to a problem? Please describe.

We're building a crawler cluster for a local area network, intended to provide a convenient search service. The people on that network use FTP or Windows shared folders, and they don't know how to trigger file indexing. I've checked the documentation, and it says that files moved into a watched directory are indexed only if they are touched after the move. Because of that, I expect our service to become inconvenient within a few days or weeks.

Describe the solution you'd like

I've looked at the crawler's core logic. There is a FsCrawlerManagementService that keeps a record of all file directories, and it provides a handy function called getFileDirectory. With that in place, I think what I want is possible and fairly easy to do. Please see this PR in our fork:

https://github.com/waterstone-company/fscrawler/pull/34
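
To illustrate the idea, here is a minimal sketch (not the code from the PR, and not FSCrawler's actual implementation). It contrasts a date-based check, which skips a moved file because its modification time is older than the last run, with a name-based check that compares what is on disk against what was already recorded for the directory. The class `MovedFileDetector` and both method names are hypothetical; the `recorded` set only stands in for whatever a call like getFileDirectory would return.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; none of these names exist in FSCrawler.
final class MovedFileDetector {

    // Date-based check: a file is picked up only if it was modified after the last run.
    // A file moved into the directory usually keeps its old modification time, so it fails this test.
    static boolean modifiedSinceLastRun(Instant lastModified, Instant lastRun) {
        return lastModified.isAfter(lastRun);
    }

    // Name-based check: every file on disk that is missing from the recorded listing
    // is treated as new, whatever its modification time is.
    static List<String> findUnindexed(Map<String, Instant> onDisk, Set<String> recorded) {
        List<String> result = new ArrayList<>();
        for (String name : onDisk.keySet()) {
            if (!recorded.contains(name)) {
                result.add(name);
            }
        }
        Collections.sort(result); // stable order for logging and comparison
        return result;
    }

    public static void main(String[] args) {
        Instant lastRun = Instant.parse("2021-06-01T00:00:00Z");
        Instant oldDate = Instant.parse("2021-01-01T00:00:00Z");

        // "report.pdf" was moved in after the last run but still carries its old date.
        Map<String, Instant> onDisk = Map.of("known.txt", oldDate, "report.pdf", oldDate);
        Set<String> recorded = Set.of("known.txt");

        System.out.println("date check picks up report.pdf: "
                + modifiedSinceLastRun(onDisk.get("report.pdf"), lastRun));
        System.out.println("files missing from the record: "
                + findUnindexed(onDisk, recorded));
    }
}
```

With the name-based comparison, users would not need to touch a file after copying it over FTP or a shared folder.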

dadoonet commented 3 years ago

I'm adding here some comments about this feature request as we discussed in https://github.com/dadoonet/fscrawler/discussions/1249