PROxZIMA / DarkSpider

Anatomy and Visualization of the Network structure of the Dark web using multi-threaded crawler
https://proxzima.dev/DarkSpider/
GNU General Public License v3.0

URL classification as illicit or not #33

Open PROxZIMA opened 1 year ago

PROxZIMA commented 1 year ago

Is your feature request related to a problem? Please describe.
The ultimate aim of the project is to detect illicit websites. The algorithm currently uses graph knowledge to target suspicious links. More advanced techniques are needed to classify links accurately and to reduce computational complexity.

Describe the solution you'd like
Text-based classification using NLP, which would turn the crawler from a traditional naive best-first crawler into a context-focused crawler. Scoring pages by their text would also let the crawler reach greater depths, since it could prioritize relevant links instead of expanding every link.
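To make the idea concrete, here is a minimal sketch of how a text score could drive the crawl order. Everything in it is an assumption for illustration and not part of DarkSpider: the class names `NaiveBayesTextClassifier` and `ScoredFrontier`, the choice of multinomial Naive Bayes (the classification technique is still undecided), the toy training sentences, and the example `.onion` URLs. It uses only the Python standard library.

```python
import heapq
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def predict_proba(self, text):
        """Return a dict mapping each label to its posterior probability."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_tokens = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counts[tok] + 1) / (total_tokens + vocab_size))
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long pages.
        top = max(log_scores.values())
        exps = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exps.values())
        return {label: value / norm for label, value in exps.items()}


class ScoredFrontier:
    """Hypothetical crawl frontier that pops the highest-scoring URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._order = 0  # tie-breaker so equal scores pop in insertion order

    def push(self, url, score):
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, self._order, url))
        self._order += 1

    def pop(self):
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)


# Toy training data (placeholder text, not a real dataset).
classifier = NaiveBayesTextClassifier().fit(
    [
        "stolen credit cards for sale",
        "counterfeit documents market",
        "privacy blog about open source software",
        "community forum for linux help",
    ],
    ["illicit", "illicit", "benign", "benign"],
)

# Score the text found around each discovered link, then queue the link.
frontier = ScoredFrontier()
discovered = {
    "http://example1.onion/": "linux software blog",
    "http://example2.onion/": "stolen cards market",
}
for url, text in discovered.items():
    frontier.push(url, classifier.predict_proba(text)["illicit"])
```

Calling `frontier.pop()` then yields the link whose surrounding text looks most suspicious, so the crawler spends its depth budget on the most relevant branches first. A real deployment would need a labeled corpus and a stronger model; the scoring and queueing split shown here would stay the same whichever classifier is chosen.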

Describe alternatives you've considered
The classification technique is yet to be decided.

PROxZIMA commented 1 year ago

@r0nl ideas?