Open tatsuhirochiba opened 6 years ago
@tatsuhirochiba Do we still need to update the exclude dirs in the code itself?
@nadgowdas Sorry, that was my mistake. I retested, and the regex rule generated by fnmatch
works fine, so we do not need to change it.
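For reference, a minimal sketch of how fnmatch-generated regex rules can be used to test paths against an exclude list. The `EXCLUDE_DIRS` values and the helper names `compile_excludes` and `is_excluded` are made up for illustration and are not taken from the crawler code.

```python
import fnmatch
import re

# Hypothetical exclude list; the crawler's real defaults may differ.
EXCLUDE_DIRS = ["/proc", "/sys", "/mnt/*", "*/.git"]


def compile_excludes(patterns):
    """Turn shell-style patterns into compiled regexes via fnmatch.translate."""
    return [re.compile(fnmatch.translate(p)) for p in patterns]


def is_excluded(path, compiled):
    """Return True if the whole path matches any exclude pattern."""
    return any(rx.match(path) for rx in compiled)


compiled = compile_excludes(EXCLUDE_DIRS)
print(is_excluded("/mnt/data/vol1", compiled))
print(is_excluded("/etc/passwd", compiled))
```

Note that in fnmatch a `*` also matches `/`, so `/mnt/*` covers everything below `/mnt`, while a plain `/proc` only matches that exact path.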
It is not directly related to this PR, but we may need a plugin reload feature (e.g. plugin_reload=True in crawler.conf)
that works without restarting the crawler daemon, plus a function to load the exclude dirs from a file,
since currently we cannot change the rule without a daemon restart.
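A rough sketch of the "load the exclude dirs from a file" part, assuming a plain text file with one pattern per line. The `ExcludeList` class and the file format are hypothetical, not existing crawler code; the file is re-read only when its modification time or size changes, so no daemon restart is needed.

```python
import os


class ExcludeList(object):
    """Hypothetical helper: reloads exclude patterns when the file changes."""

    def __init__(self, path):
        self.path = path
        self._signature = None  # (mtime_ns, size) of the last loaded version
        self._patterns = []

    def patterns(self):
        st = os.stat(self.path)
        signature = (st.st_mtime_ns, st.st_size)
        if signature != self._signature:
            with open(self.path) as f:
                # Skip blank lines and '#' comments.
                self._patterns = [
                    line.strip()
                    for line in f
                    if line.strip() and not line.lstrip().startswith("#")
                ]
            self._signature = signature
        return list(self._patterns)
```

The crawl loop would call `patterns()` at the start of each iteration, so an edit to the file takes effect on the next crawl.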
@tatsuhirochiba Yes, we need to enable CRUD on the exclude list. Sorry, I didn't get how you proposed we implement that above. Can you explain it?
@sahilsuneja1 do you have any thoughts on this ^^? Do we need to implement that feature in the next release?
One way, I think, is to run docker inspect
and read the mount map, which would give us the externally mounted directories inside the container. We could add those to the exclude list?
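Something along these lines, as a sketch only. It assumes the `Mounts` array and its `Destination` field as they appear in `docker inspect` JSON output; the function names `mount_destinations` and `exclude_dirs_for` are made up here.

```python
import json
import subprocess


def mount_destinations(inspect_json):
    """Return in-container mount points from `docker inspect` JSON output."""
    dests = []
    for container in json.loads(inspect_json):
        for mount in container.get("Mounts") or []:
            dest = mount.get("Destination")
            if dest:
                dests.append(dest)
    return dests


def exclude_dirs_for(container_id, base_excludes):
    """Merge a container's mount points into the base exclude list."""
    out = subprocess.check_output(["docker", "inspect", container_id])
    mounts = mount_destinations(out.decode("utf-8"))
    return sorted(set(base_excludes) | set(mounts))
```

`exclude_dirs_for` needs a running Docker daemon; `mount_destinations` can be exercised on captured JSON alone.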
Not fully clear, but @tatsuhirochiba has confirmed this PR is not required, fnmatch worked for him.
@sahilsuneja1 @tatsuhirochiba sorry to piggyback on this issue, but the real problem we want to solve is how to extend the exclude list
in the crawler; that's what I was getting at above.
Hmm, exclude_dirs could be passed in at runtime from crawler.conf. This would avoid any direct change in the code. Are you referring to changing exclude_dirs dynamically while the crawler is running, instead of restarting the crawler?
@tatsuhirochiba we should close this, right?
This PR is for issue #370.
Signed-off-by: Tatsuhiro Chiba <chiba@jp.ibm.com>