honeynet / cuckooml

CuckooML: Machine Learning for Cuckoo Sandbox
https://honeynet.github.io/cuckooml/
145 stars 47 forks source link

Naive filesystem path concatenation #20

Open So-Cool opened 7 years ago

So-Cool commented 7 years ago

Right now filesystem paths are created through string concatenation. This has to be changed to os.path.join() for robustness:

greninja commented 7 years ago

For line 1213, which is better(checking with os.path.split(path) ):

`if os.path.split(save_location)[1] : `

or if os.path.split(save_location)[1] != '' :

So-Cool commented 7 years ago

If all the remaining lines use os.path.join, we won't need lines 1213 and 1214 as it will be already taken care of by os.path.join.

greninja commented 7 years ago

So basically you want to add the trailing pathname seperator whilst passing argument(alternative_location) on line 166 itself using os.path.join? Because right now. on line 166 we are not explicitly adding the seperator in the end.

So-Cool commented 7 years ago

It's not about path separator in the end, it's all about concatenating file paths in robust way.
When I was writing the code I knew exactly where the path will end with /, so I only appended it when necessary. The whole point of os.path.join is that you don't have to worry about it; os.path.join("abc", "xyz") and os.path.join("abc/", "xyz") both return "abc/xyz". Therefore, it's more universal and robust -- exactly what we need.

greninja commented 7 years ago

I understand the robustness part. But according to the code , in lines 1213 and 1214 we are checking for a trailing '/' and if there isnt we are appending one. So if we remove those lines ,line 166 would look something like this: ml.save_clustering_results(loader, os.path.join(CUCKOO_ROOT, cfg. \ cuckooml.clustering_results_directory,""))

I am referring to line 166 because thats what I got when I traced back the origin of 'alternative_location'.

Is it correct?

So-Cool commented 7 years ago

Sorry but I don't really have time to walk you through the issue; I'm quite busy right now so I can only review the code and comment on it.

For example, in save_json function we implicitly assume that root_dir ends with /

    def save_json(self, root_dir):
        """Save JSON stored in the class to a file."""
        with open(root_dir+self.name, "w") as j_file:
            json.dump(self.report, j_file)

that's why everywhere else we check for / at the end of filesystem path.

greninja commented 7 years ago

Ok @So-Cool .

Actually the line of code which I sent in the earlier comment does it all i.e. there wont be any need to explicitly check for the trailing '/' (lines 1213 and 1214) and save_json function can safely assume and carry out the operation.

So-Cool commented 7 years ago

Great! Before you do a PR please send me a link to you commits and I'll have a look at them.

greninja commented 7 years ago

https://github.com/greninja/cuckooml/commits/filesystem_path_concatenation Link to the commit.

So-Cool commented 7 years ago

@greninja, linked commit does not exist.

greninja commented 7 years ago

https://github.com/greninja/cuckooml/commit/6859d6e9ef0d54740243076ce99b66338838d69c

So-Cool commented 7 years ago

Comments attached to your commit.

greninja commented 7 years ago

Any issues with the commit? @So-Cool