Closed zoe70416 closed 3 months ago
There is some duplicated code that we can consolidate by moving the `preprocessdns` function into `utils` and having a single `train.py` script that takes user input on what to train on. For example, we can have a command line argument, `log-type-to-train`, that takes a comma-separated list of the logs to train on, such as `conn,ssh` or just `http`. Then we can have an if statement in the main control flow of `train.py` that checks which arguments were passed and executes the `preprocess` and train code for each requested log type.
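A minimal sketch of what that single entry point could look like, assuming `argparse` is used for the command line. The set of supported log types, the helper names (`parse_log_types`, `train_log_type`), and the placeholder return value are all hypothetical, and a dictionary lookup stands in for the if statement described above. The real `preprocess` and train code would replace the placeholder.

```python
import argparse

# Assumed set of log types; adjust to whatever the repo actually supports.
SUPPORTED_LOG_TYPES = ("conn", "dns", "http", "ssh")


def parse_log_types(value):
    """Turn a string such as 'conn,ssh' into ['conn', 'ssh']."""
    log_types = [item.strip() for item in value.split(",") if item.strip()]
    if not log_types:
        raise argparse.ArgumentTypeError("expected at least one log type")
    unknown = [t for t in log_types if t not in SUPPORTED_LOG_TYPES]
    if unknown:
        raise argparse.ArgumentTypeError(
            "unsupported log type(s): " + ", ".join(unknown)
        )
    return log_types


def train_log_type(log_type, log_dir):
    # Hypothetical placeholder: the real version would call the shared
    # preprocess code from utils and then the training code for this type.
    return f"trained {log_type} from {log_dir}"


# Dispatch table used in place of an if/elif chain; every type maps to the
# same placeholder here, but each could point at its own function.
TRAINERS = {log_type: train_log_type for log_type in SUPPORTED_LOG_TYPES}


def build_parser():
    parser = argparse.ArgumentParser(description="Train on one or more log types")
    parser.add_argument("--log-dir", required=True, help="directory holding the logs")
    parser.add_argument(
        "--log-type-to-train",
        type=parse_log_types,
        required=True,
        help="comma-separated list, e.g. conn,ssh or just http",
    )
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Run the preprocess and train step for each requested log type in order.
    return [TRAINERS[t](t, args.log_dir) for t in args.log_type_to_train]
```

With this shape, `python train.py --log-dir /usr/local/logs --log-type-to-train conn,ssh` would train both types in one run, provided `main()` is called from an `if __name__ == "__main__":` guard at the bottom of the file.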
Working on the `infer.py` portion of this change. `infer.py` also needs to be broken out into modules, and maybe it can be multiprocessed too.
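One possible shape for the multiprocessing idea, sketched with the standard library's `concurrent.futures.ProcessPoolExecutor`. The function names (`infer_log_type`, `run_inference`) and the `workers` parameter are hypothetical, and the per-type function only builds a log path as a stand-in for the real inference code.

```python
from concurrent.futures import ProcessPoolExecutor


def infer_log_type(log_type, log_dir):
    # Hypothetical placeholder: the real version would load the model for
    # this log type and run inference on the matching log file.
    return f"{log_dir}/{log_type}.log"


def run_inference(log_types, log_dir, workers=1):
    """Run inference per log type, in parallel when workers > 1."""
    if workers <= 1:
        # Sequential path, useful for debugging and small runs.
        return [infer_log_type(t, log_dir) for t in log_types]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order in its results.
        return list(pool.map(infer_log_type, log_types, [log_dir] * len(log_types)))
```

When `workers` is greater than 1, `run_inference` should be called from inside an `if __name__ == "__main__":` guard, because some start methods re-import the main module in each worker process.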
Will test on the HSRN node and see what happens. It's starting to look really good.
Tested, LGTM
Finish processing dns.log and http.log
No errors when running these:
`python train_dns.py --log-dir /usr/local/logs`
`python train_http.py --log-dir /usr/local/logs`