Closed by pashashocky 8 years ago
@pashashocky I'm just finishing the same thing :D
You realize that upstream also accepts pull requests, right? :-P
Not to hijack the thread, but did you report any bugs / PRs?
@jbremer from my side it will come, but later, once it has been tested under heavy load on 3 servers. By the way, I will drop you an email. @pashashocky I think it would be good to start sharing code and discussing it.
In a few words, my code does the same as yours, but I'm returning the compressed Mongo report and storing it in the master Mongo, saving the original task ID and returning it when you submit tasks, so the dist DB is just used as a proxy. That's only the short description, though.
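A minimal sketch of the "compressed report" idea mentioned above, assuming the report is a JSON-serializable dict and that zlib is an acceptable codec. The function names and the sample report are hypothetical and are not taken from the actual fork.

```python
import json
import zlib


def pack_report(report: dict) -> bytes:
    """Serialize a report dict to JSON and compress it for transfer (slave side)."""
    raw = json.dumps(report, sort_keys=True).encode("utf-8")
    return zlib.compress(raw, 9)


def unpack_report(blob: bytes) -> dict:
    """Reverse pack_report before storing the report (master side)."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical report, standing in for what a slave node might produce.
slave_report = {"info": {"id": 7}, "signatures": ["example"] * 50}

blob = pack_report(slave_report)
restored = unpack_report(blob)
```

The master would then store `restored` under its own task ID rather than the slave's.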
@jbremer I have actually been getting 3-5 different error types from different places. I can come on irssi to talk with you about some of them, but they came from process.py, scheduler.py and some other places. They would cause part of the slave node to stop working, and that node would be KIA.
Regarding PRs, there are still a few things I would like to implement, although maybe I could do a PR for you. I slightly modified the DB structure to have a link between the task ID on the slave and its corresponding web ID. Additionally, I pull most of the analysis data from the slave (~300 MB) and, using a modified processing module, insert it into Elastic/Mongo so that it shows up in the web UI...
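A rough sketch of what such a link between slave task IDs and web IDs could look like, assuming a single mapping table. The table name, column names and helper functions are hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task_link (
        main_task_id INTEGER PRIMARY KEY,   -- ID shown in the master web UI
        node_name    TEXT    NOT NULL,      -- which slave ran the analysis
        node_task_id INTEGER NOT NULL,      -- task ID on that slave
        UNIQUE (node_name, node_task_id)
    )
    """
)


def link_task(conn, main_task_id, node_name, node_task_id):
    """Record which slave task corresponds to a task ID on the master."""
    with conn:
        conn.execute(
            "INSERT INTO task_link VALUES (?, ?, ?)",
            (main_task_id, node_name, node_task_id),
        )


def main_id_for(conn, node_name, node_task_id):
    """Look up the master task ID for a slave task, or None if unknown."""
    row = conn.execute(
        "SELECT main_task_id FROM task_link "
        "WHERE node_name = ? AND node_task_id = ?",
        (node_name, node_task_id),
    ).fetchone()
    return row[0] if row else None


link_task(conn, 101, "slave1", 5)
found = main_id_for(conn, "slave1", 5)
missing = main_id_for(conn, "slave2", 5)
```

The uniqueness constraint on (node_name, node_task_id) matters because different slaves will reuse the same local task IDs.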
Well, process.py is known to be buggy, process2.py is already much better, and cuckoo process from the upcoming Cuckoo package is even slightly better in the sense that it also supports non-PostgreSQL (https://github.com/cuckoosandbox/cuckoo/pull/863). Bugs from scheduler.py I'd be happy to hear about - aside from known issues with analysis tags it should work mostly fine.
Regarding the Distributed Cuckoo, that makes sense yeah. Only problem is finding the solution to where the data should be stored and pushed, from there on making the required adjustments is not too difficult - PRs welcome of course.
@doomedraven Hi! I am working with @pashashocky. Is there any repository where you have your code so we can see it? Let's work together, maybe we can speed up the process and get distributed to cuckoo-distributed :)
@xdanx agreed :) Are you guys also on the Cuckoo IRC? Just to exchange emails. No, it's not published anywhere yet, but I will push it to https://github.com/doomedraven/cuckoo-modified in a moment.
We're active on the Cuckoo IRC as well. Let's work on your fork then, and hopefully we can generate a PR that will be accepted.
Published here: https://github.com/doomedraven/cuckoo-modified/commit/0cf1c049029f7f4f3ecd3ebca90e294c9adeeee3
I would like to talk with you guys to see if we can improve more things. For a start, this is just a test. The last step I wanted to do is to return the ID from the current DB, not the proxy, so that when someone requests a report from the dist API the correct one is returned. The part that gets the Mongo report from the slaves and stores it works fine, also from the master.
Going to search for you on IRC.
@xdanx I don't see any user with your ID in the Cuckoo channel. I use the same username there, can you ping me?
@spender-sandbox could you check this https://github.com/spender-sandbox/cuckoo-modified/compare/master...doomedraven:distributed?expand=1
and tell us what you think, along with any suggestions or ideas? @pashashocky, @xdanx and I put a bit of love into it, and it works very well. We are still working on the last step, retrieving all files (pcap, memdump).
In a few words, here is how it works:
1) There must be a master node, which gets back all files and data.
2) When you push a task and it goes to a slave, a task ID is reserved in main_db. Once the analysis has finished on the slave and the data has been retrieved (the Mongo report, which must be generated on the slave to avoid reprocessing and that 16 MB limit, plus behaviour, report.json and screenshots), the data can be inserted into the main DB and Mongo and the report drawn in the master web GUI. It also inserts a link to the original analysis. We leave it this way for now just to make it easier for users to retrieve the pcap/memdump, but it will be removed once we solve the moving problem. That will also move the binary and create a symlink so that downloads from the web GUI/API work correctly.
3) There is also support for htaccess for api.py. We know it is "deprecated", but this avoids more load on Django: users keep /api/ with limits, and admins use api.py with basic auth if required.
4) Uploading tasks with tags also works :)
5) I'm sure I forgot some nice stuff, but you can check it at that link. We tried to make minimal changes to existing files, apart from dist.py.
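The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: MainDB, reserve, store_report and link_binary are hypothetical names, an in-memory dict stands in for main_db, and the directory layout (one folder per task ID containing a "binary" symlink) is assumed for the example rather than taken from the fork.

```python
import os
import tempfile


class MainDB:
    """In-memory stand-in for the master's task database (hypothetical)."""

    def __init__(self):
        self.tasks = {}
        self._next_id = 1

    def reserve(self, node, node_task_id):
        """Reserve a master task ID as soon as a task is sent to a slave."""
        task_id = self._next_id
        self._next_id += 1
        self.tasks[task_id] = {
            "node": node,
            "node_task_id": node_task_id,
            "status": "reserved",
            "report": None,
        }
        return task_id

    def store_report(self, task_id, report):
        """Attach the report fetched from the slave to the reserved ID."""
        self.tasks[task_id]["report"] = report
        self.tasks[task_id]["status"] = "reported"


def link_binary(storage_root, task_id, binary_path):
    """Create <storage_root>/<task_id>/binary as a symlink to the sample."""
    analysis_dir = os.path.join(storage_root, str(task_id))
    os.makedirs(analysis_dir, exist_ok=True)
    link = os.path.join(analysis_dir, "binary")
    os.symlink(binary_path, link)
    return link


# Demo with a throwaway directory and a fake two-byte sample.
root = tempfile.mkdtemp()
sample = os.path.join(root, "sample.bin")
with open(sample, "wb") as fh:
    fh.write(b"MZ")

db = MainDB()
task_id = db.reserve("slave1", node_task_id=42)   # step 2: reservation
db.store_report(task_id, {"info": {"id": 42}})    # step 2: data retrieved
link = link_binary(os.path.join(root, "analyses"), task_id, sample)
```

Reserving the ID up front means the ID handed back at submission time is already the one the master web GUI will use once the report arrives.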
Main discussion thread: https://github.com/spender-sandbox/cuckoo-modified/pull/229
Hello Brad,
I have been using cuckoo modified for a while, and it does a lot of what I want, although I was tempted to try 2.0 due to them beginning work on distributed cuckoo among nodes.
I was able to write my own code to integrate several nodes that report back to the leader, with the reports added to the web interface. Unfortunately 2.0 is just not stable enough for me and has a lot of bugs that cause crashes and prevent distributed nodes from reporting smoothly.
Now I am looking to implement distributed cuckoo into cuckoo-modified, wondering if you have any extra ideas, and whether you would be interested in a PR. Maybe you could give me an email so that we could talk there?
Kind Regards, Pash