DPCS-team / DPCS

Data Powered Crash Solver

Research/ticket 68/Paper - data preprocessing #56

Open inexxt opened 8 years ago

inexxt commented 8 years ago

Description of the data preprocessing techniques we plan to use in the project:
1) Normalization of system paths (~home, /opt/bin, /bin/, etc.) via heuristics
2) Lowercasing, 's, timestamps, PII removal (emails, passwords) (library?)
3) Optional translation
4) Stopwords (do we need them?)
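
A minimal sketch of how items 1 and 2 might look in Python, using only the standard `re` module. Everything here is an assumption made for illustration, not something decided in this ticket: the function name `preprocess`, the placeholder tokens (`<home>`, `<email>`, `<timestamp>`), the timestamp format (ISO-like only), and the reading of "'s" as "strip the possessive suffix". Password removal, translation, and stopword filtering are not covered.

```python
import re

# All patterns and placeholder tokens below are hypothetical choices.
# Home directories: /home/<user> or a leading ~ / ~<user>.
HOME_RE = re.compile(r"(?:/home/[^/\s]+|(?<!\S)~[^/\s]*)")
# A deliberately simple email pattern; not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# ISO-like timestamps only, e.g. 2016-03-01 12:34:56 or 2016-03-01T12:34:56.789
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?")
# Possessive 's directly following a word character.
POSSESSIVE_RE = re.compile(r"(?<=\w)'s\b")


def preprocess(text):
    """Apply the heuristic normalizations in a fixed order."""
    text = EMAIL_RE.sub("<email>", text)          # mask emails (PII)
    text = TIMESTAMP_RE.sub("<timestamp>", text)  # mask timestamps
    text = HOME_RE.sub("<home>", text)            # normalize home directories
    text = text.lower()                           # placeholders are already lowercase
    text = POSSESSIVE_RE.sub("", text)            # drop possessive 's
    return text


if __name__ == "__main__":
    sample = ("2016-03-01 12:34:56 Crash in /home/alice/app/run.py: "
              "user's mail bob@example.com")
    print(preprocess(sample))
```

Masking runs before lowercasing so the patterns see the original text; the placeholders are lowercase so the later `lower()` call leaves them unchanged. System paths such as /opt/bin and /bin/ are left as-is in this sketch, since which of them should be collapsed is still an open heuristic.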