Bender250 / eth_knowledge_base

My personal wiki knowledge base for ETH InfoSec
MIT License

Enforcement bots TODO list #1

Open Bender250 opened 4 years ago

Bender250 commented 4 years ago
  1. Implementation

    • [x] Mailserver for generation of unique addresses
    • [x] Fix issue with dropping malformed emails
    • [x] Crawler
    • [x] Language detection + English version detection
    • [x] Detection of registration forms, terms and conditions, and privacy policies (ongoing refinement)
    • [x] Orchestration (user guides the crawler)
    • [x] Run crawler to detect 1000 potential registration forms
    • [x] Algorithm for extraction of registration form features
    • [ ] Registration form classification (dependent on Training reg. forms dataset collection)
    • [ ] Data pre-processing: cleaning, language features embedding
    • [ ] Modeling
    • [ ] Using the output of classification in the crawler (this connection is more challenging than it seems)
    • [ ] Email classification (dependent on collection of Training reg. forms dataset and Mail labeling)
    • [ ] Features analysis and extraction
    • [ ] Classification
    • [x] Email registration confirmation
    • [x] Finish registration process (e.g., clicking confirmation links, using registration code)
  2. Study

    • [x] Pilot study
    • [x] What aspects are interesting?
    • [ ] Training registration forms dataset collection (depends on Crawler orchestration and Running crawler)
    • [ ] Can we collect 1000 registration forms?
    • [x] Processing corresponding emails
    • [ ] Final study
    • [ ] In the ideal case, we can find all types of violations automatically. This study then analyzes a sample to confirm the rates of false positives and false negatives
    • [ ] If the automation is less successful, we have to fall back on the orchestration. Can we do 10k registrations?
  3. Writing

    • [ ] Analyze the following research questions:
    • [x] Are email addresses shared with third parties?
    • [ ] Where do the spammers get the email addresses?
    • [ ] What proportion of services send unsolicited mail? Are they smaller or larger companies?
    • [ ] Which services force users to accept newsletters? Are they smaller or larger companies?
    • [ ] Are the registration forms themselves compliant (pre-accepted T&C/PP)?
    • [ ] Do the "Unsubscribe" links work?
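
A rough sketch of how the "unique addresses" item could feed the "shared with third parties?" question: give every service its own address, then flag mail that reaches that address from an unrelated sender domain. Everything named here (`SECRET`, `DOMAIN`, `address_for`, `build_index`, `is_third_party`) is hypothetical and not taken from the actual mailserver; the HMAC-based tag is just one possible way to make addresses unguessable. A sender-domain mismatch is only a hint, since services often send through external mail providers, so flagged mail would still need manual review.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Optional

SECRET = b"change-me"     # hypothetical key; keeps addresses unguessable
DOMAIN = "study.example"  # hypothetical catch-all domain of the mailserver


def address_for(service_domain: str) -> str:
    """Derive one stable, unique address per service domain."""
    digest = hmac.new(SECRET, service_domain.lower().encode(), hashlib.sha256)
    return f"u{digest.hexdigest()[:12]}@{DOMAIN}"


def build_index(services: Iterable[str]) -> Dict[str, str]:
    """Map each generated address back to the service it was given to."""
    return {address_for(s): s.lower() for s in services}


def is_third_party(recipient: str, from_addr: str,
                   index: Dict[str, str]) -> Optional[bool]:
    """Return True if mail to a tracked address comes from another domain.

    Returns None when the recipient is not one of our generated addresses.
    Subdomains of the registered service count as first party.
    """
    service = index.get(recipient.lower())
    if service is None:
        return None
    sender = from_addr.rsplit("@", 1)[-1].lower()
    return not (sender == service or sender.endswith("." + service))
```

Usage would be along the lines of `index = build_index(crawled_domains)` once after the registrations, then `is_third_party(rcpt, sender, index)` for each incoming message.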
Bender250 commented 4 years ago

3 law papers:

Bender250 commented 4 years ago

Detailed TODO list for the implementation.

Optional