cloudviz / agentless-system-crawler

A tool to crawl systems like crawlers for the web
Apache License 2.0

change regex rule #371

Open tatsuhirochiba opened 6 years ago

tatsuhirochiba commented 6 years ago

This PR addresses issue #370.

Signed-off-by: Tatsuhiro Chiba <chiba@jp.ibm.com>

nadgowdas commented 6 years ago

@tatsuhirochiba Do we still need to update the exclude dirs in the code itself?

tatsuhirochiba commented 6 years ago

@nadgowdas Sorry, that was my mistake. I reran the test, and the regex rule generated by fnmatch works fine, so we do not need to change it.
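For readers following along, here is a minimal sketch of how shell-style exclude patterns can be turned into regexes with Python's `fnmatch.translate`. The pattern list and the helper names `compile_excludes` and `is_excluded` are assumptions for illustration, not the crawler's actual code.

```python
import fnmatch
import re

# Hypothetical exclude patterns; the crawler's real list may differ.
EXCLUDE_DIRS = ["/proc", "/sys", "/var/lib/docker/*", "*/.git"]


def compile_excludes(patterns):
    """Translate shell-style patterns into compiled regexes via fnmatch."""
    return [re.compile(fnmatch.translate(p)) for p in patterns]


def is_excluded(path, compiled):
    """Return True if the whole path matches any exclude pattern."""
    return any(rx.match(path) for rx in compiled)


COMPILED = compile_excludes(EXCLUDE_DIRS)

if __name__ == "__main__":
    for candidate in ["/proc", "/proc/1", "/var/lib/docker/overlay2", "/home/u/repo/.git"]:
        print(candidate, is_excluded(candidate, COMPILED))
```

Note that the translated regex is anchored at the end, so a pattern without a wildcard such as `/proc` matches only that exact path, and `*` in fnmatch also matches `/`.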

It is not directly related to this PR, but we may need a plugin reload feature (e.g. plugin_reload=True in crawler.conf) that works without restarting the crawler daemon, plus a function to load the exclude dirs from a file, since we currently cannot change the rule without a daemon restart.
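One way the "load the exclude dirs from a file" idea could look is sketched below. This is only an illustration: the `ExcludeList` class, the one-pattern-per-line file format, and the mtime-based reload are all assumptions, and nothing here is an existing crawler feature.

```python
import os


class ExcludeList:
    """Exclude dirs loaded from a text file, one pattern per line (assumed format)."""

    def __init__(self, path):
        self.path = path
        self._mtime = None  # modification time seen at the last load
        self.patterns = []

    def refresh(self):
        """Reload the file only if its modification time changed.

        Returns True when the patterns were reloaded, False otherwise.
        """
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self._mtime:
            return False
        with open(self.path) as f:
            # Skip blank lines and '#' comments.
            self.patterns = [
                line.strip()
                for line in f
                if line.strip() and not line.lstrip().startswith("#")
            ]
        self._mtime = mtime
        return True
```

A daemon could call `refresh()` at the start of each crawl iteration, so that edits to the file take effect on the next pass without a restart.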

nadgowdas commented 6 years ago

@tatsuhirochiba Yes, we need to enable CRUD on the exclude list. Sorry, I didn't get how you proposed we implement that above. Can you explain it?

nadgowdas commented 6 years ago

@sahilsuneja1 do you have any thoughts on this ^^? Do we need to implement that feature in the next release?

One way, I think, is to run docker inspect and read the mount map, which would give us the externally mounted directories inside the container. We could then add those to the exclude list?
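A rough sketch of the docker inspect idea follows. It assumes the JSON array that `docker inspect` prints, where each container object carries a `Mounts` list with a `Destination` field; the function names are hypothetical and how the result would be merged into the crawler's exclude list is left open.

```python
import json
import subprocess


def mount_destinations(inspect_json):
    """Extract in-container mount points from `docker inspect` JSON output."""
    containers = json.loads(inspect_json)
    dests = []
    for container in containers:
        # "Mounts" may be missing or null for containers without mounts.
        for mount in container.get("Mounts") or []:
            dest = mount.get("Destination")
            if dest:
                dests.append(dest)
    return dests


def inspect_container(container_id):
    """Run `docker inspect` and return its raw JSON (requires the docker CLI)."""
    return subprocess.check_output(["docker", "inspect", container_id], text=True)
```

Usage would be along the lines of `mount_destinations(inspect_container(container_id))`, with the returned paths appended to the exclude list before the crawl starts.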

sahilsuneja1 commented 6 years ago

Not fully clear, but @tatsuhirochiba has confirmed this PR is not required; fnmatch worked for him.

nadgowdas commented 6 years ago

@sahilsuneja1 @tatsuhirochiba sorry to piggyback on this issue, but the real problem we want to solve is how to extend the exclude list in the crawler. That's what I was getting at above.

sahilsuneja1 commented 6 years ago

Hmm, exclude_dirs could be passed in at runtime from crawler.conf. This would avoid any direct change in the code. Are you referring to changing exclude_dirs dynamically while the crawler is running, instead of restarting the crawler?
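To make the crawler.conf suggestion concrete, here is a minimal sketch that reads an `exclude_dirs` option from INI-style text using the standard library's `configparser`. The section name, the comma-separated value format, and the defaults are assumptions; the real crawler.conf layout and parser may differ.

```python
import configparser

# Hypothetical config snippet; the real crawler.conf layout may differ.
SAMPLE_CONF = """
[file_container]
exclude_dirs = /proc, /sys, /mnt/*
"""

# Assumed fallback used when the option is absent.
DEFAULT_EXCLUDES = ["/proc", "/sys"]


def read_exclude_dirs(conf_text, section="file_container"):
    """Return the exclude dirs from config text, or the defaults if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get(section, "exclude_dirs", fallback="")
    dirs = [d.strip() for d in raw.split(",") if d.strip()]
    return dirs or list(DEFAULT_EXCLUDES)


if __name__ == "__main__":
    print(read_exclude_dirs(SAMPLE_CONF))
```

This covers only the "configure at startup" case; changing the list while the daemon is running would still need some reload mechanism.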

sahilsuneja1 commented 6 years ago

@tatsuhirochiba we should close this, right?