connorskees closed 6 years ago
Great! Can you confirm that it's working as intended?
Yes, everything works exactly as intended. --exclude . and --exclude http. return 0 URLs, while --exclude \d and --exclude "" return the same results as running without the exclude flag.
It's not working on my end.
With no flag:
Flag of .*
Flag of "thisdoesnotexist"
Flag of .*teach.* (note that only one link is removed)
Doesn't work :')
Hmm, do you want to exclude keywords or regexes? With a regex, change it to .*?questions
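For anyone following along, here is a rough sketch of why the leading .*? might matter. It assumes the exclude pattern is applied with Python's re.match, which only matches at the start of the string; I haven't confirmed that this is what photon.py does. The URL and the two helper names are made up for illustration.

```python
import re

# Hypothetical URL, used only for illustration.
URL = "https://example.com/questions/1"


def matches_at_start(pattern, url):
    """True if the pattern matches at the beginning of the URL (re.match)."""
    return re.match(pattern, url) is not None


def matches_anywhere(pattern, url):
    """True if the pattern matches anywhere in the URL (re.search)."""
    return re.search(pattern, url) is not None


# A bare keyword does not match at the start, because the URL begins with "https".
print(matches_at_start(r"questions", URL))      # False
# A lazy wildcard prefix lets the same keyword match from the start.
print(matches_at_start(r".*?questions", URL))   # True
# re.search finds the keyword without any prefix.
print(matches_anywhere(r"questions", URL))      # True
```

If the pattern is applied with re.search instead, a plain keyword and a regex behave the same way and the prefix is unnecessary.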
Sorry, my bad. I added it back in with Update photon.py (4c60e1b).
It still doesn't work :)
I think the issue is that although it isn't crawling the links, it is still adding them to the links file. Is it OK to just use remove_regex() on the list before they are exported?
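Roughly what I have in mind, as a minimal sketch rather than the actual code: remove_regex_sketch is a hypothetical stand-in for remove_regex() (the real signature may differ), it assumes matching with re.search, and the sample links are invented. An empty pattern leaves the list untouched, which matches the --exclude "" behaviour reported earlier in the thread.

```python
import re


def remove_regex_sketch(urls, pattern):
    """Return the URLs that do NOT match pattern (hypothetical helper)."""
    # An empty or missing pattern means "exclude nothing".
    if not pattern:
        return list(urls)
    compiled = re.compile(pattern)
    # Keep only the URLs where the pattern is not found anywhere.
    return [url for url in urls if not compiled.search(url)]


# Invented sample data standing in for the collected links.
links = [
    "https://example.com/teach/intro",
    "https://example.com/about",
]

# Filter once more right before export, so excluded links never reach the file.
exported = remove_regex_sketch(links, r"teach")
print(exported)
```

Filtering at export time is a second pass on top of the crawl-time check, so a link that was collected but never crawled is still dropped from the output.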
Testing this against the links file now: links matching the regex are no longer added.
Not to spam you with pull requests, but is this what you were thinking?