npapernot / phishing-detection

Train a simple decision tree classifier to detect websites used for phishing
MIT License
75 stars 36 forks

feature detection #1

Open guihui123 opened 7 years ago

guihui123 commented 7 years ago

Thank you for sharing this. However, I found only numeric values in the file data.csv. If I have an HTML file, how can I get its feature information? Did you write a feature detection script? If so, could you share it in your project?

npapernot commented 7 years ago

I reused the extracted features and labels from the UCI dataset mentioned in the readme. I don't think the dataset came with a feature extractor.

eric-edem commented 1 year ago

Hi @guihui123, has anyone been able to come up with the feature extraction script?

I don't think the current implementation can accept raw URLs, extract the features, and then make a prediction. Is there any solution to this?
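For anyone with the same question, here is a minimal sketch of what a URL-only feature extractor could look like. It is not part of this repository and not the extractor used to build the UCI dataset. The function name `url_features`, the dictionary keys, and the thresholds are assumptions based on my reading of the dataset's feature description, where 1 means legitimate, 0 suspicious, and -1 phishing. It covers only four of the dataset's features, so the result cannot be fed to the trained classifier without computing the remaining columns in the same order as data.csv.

```python
import ipaddress
from urllib.parse import urlparse


def has_ip_host(host):
    """Return True if the host part of the URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Hypothetical helper: derive a few URL-based features.

    Encoding (assumed from the UCI feature description):
    1 = legitimate, 0 = suspicious, -1 = phishing.
    """
    host = urlparse(url).hostname or ""
    length = len(url)
    return {
        # An IP address in place of a domain name is treated as phishing.
        "having_ip_address": -1 if has_ip_host(host) else 1,
        # Assumed thresholds: < 54 legitimate, 54 to 75 suspicious, > 75 phishing.
        "url_length": 1 if length < 54 else (0 if length <= 75 else -1),
        # An "@" anywhere in the URL is treated as phishing.
        "having_at_symbol": -1 if "@" in url else 1,
        # A "-" in the domain name is treated as phishing.
        "prefix_suffix": -1 if "-" in host else 1,
    }


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login"))
```

The remaining features in the dataset (for example those based on page content, domain registration, or traffic rank) need the HTML or external lookups, so they would have to be implemented separately before a raw URL can be classified.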