csci4950tgt / honeyclient

1 stars 1 forks source link

refactor(*): resolve race conditions #50

Closed williamtdr closed 4 years ago

williamtdr commented 4 years ago

Hey all,

This fixes some errors in the timing and order we were processing pages, which meant that some things (like Yara) didn't work very consistently, and we'd sometimes miss some page content (like images) on the full-page screenshots.

In doing this, I had to make some pretty big refactors to the Yara manager to make it promise-based, and follow the async/await paradigm so the main ticket script knows when it can continue. Their API is pretty dang awful, but it doesn't look like anyone's made a promise-compliant version. Clawed through it with util.promisify. @jin-yng would appreciate your review in case I broke or missed anything. One thing I noticed is that in the current logic, we skip trying to analyze any files if there's compile warnings. Looks like those happen to a lot of sites, not sure if that's intended behavior.

@alejx, I broke safe browsing into its own class to fit the pattern of the rest of ticket.js better. Only thing I changed logic-wise is that the environment variable gets checked on startup, not on first use of the safe browsing code.

I've been pretty sick/tired when working on this, so please check this PR a bit more closely than usual 😅 Stay safe, everyone.

Update: OCR, google safe browsing, and yara now all happen in parallel. We also have logging for how long each operation takes, whether it needs to block future operations or not.

Commits fix(safe-browsing): check for API key on honeyclient startup refactor(safe-browsing): move into its own class fix(yara): wait until all files have been processed to exitt refactor(yara): extract code from callback hell refactor(yara): centralize error handling refactor(yara): move to be more in-line with the other managers feat(async-worker): EventEmitter paradigm for ops which require deferred callbacks chore(yara): remove problematic rule feat(ticket.js): do operations in parallel feat(): add wall clock timers to monitor performance feat(yara): cleanup resources before returning, fix scanner bug fix(tickets): oops, we definitely need to include the screenshots feat(): improve logging

jin-yng commented 4 years ago

I really like the overall simplicity of it much better now. Never liked all those nested loops...

williamtdr commented 4 years ago

Yeah, logging is definitely a bit noisy now - but start/end and wall clock is definitely useful to have. The log on identifiers while loading the page wasn't actually supposed to be kept around - I was debugging some stuff and left that in there when I was writing that. Replaced it with a request log for now since it's nice to show something while we're waiting for the page to load.