Closed. MarioVilas closed this issue 10 years ago.
The fix is trivial, but I wanted to check with you first because you went to a lot of trouble to check the extension when checking the content type was much easier, so I want to know why you did it that way.
You're right. Do it!
On 02/01/2014, at 19:20, Mario Vilas notifications@github.com wrote:
The fix is trivial, but I wanted to check with you first because you went to a lot of trouble to check the extension when checking the content type was much easier, so I want to know why you did it that way.
Never mind, now I know what the code does. The file extension check is only run when the content-length header is missing, which sometimes happens with crappy dynamic pages.
Maybe we could let the plugins set a maximum download size somewhere, or even define it in the API itself - the check_download method is only called when the headers arrive.
In any case, it's a false alarm :) so I'm closing the ticket as invalid.
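For readers following along, the behaviour described above could be sketched roughly like this. It is only an illustration, not the actual GoLismero code: the function signature, the `MAX_DOWNLOAD_SIZE` limit and the `BINARY_EXTENSIONS` set are all assumptions made for the example. The size check is used whenever a Content-Length header is present, and the file extension is only consulted as a fallback when the header is missing or malformed.

```python
import posixpath
from urllib.parse import urlparse

# Assumed values for illustration only; a plugin or the API could override them.
MAX_DOWNLOAD_SIZE = 200_000
BINARY_EXTENSIONS = {".zip", ".exe", ".iso", ".pdf", ".jpg", ".png"}


def check_download(url, headers, max_size=MAX_DOWNLOAD_SIZE):
    """Hypothetical check, called once the response headers have arrived.

    Returns True if the body should be downloaded, False otherwise.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    length = normalized.get("content-length")
    if length is not None:
        try:
            # Preferred path: trust the declared size.
            return int(length) <= max_size
        except ValueError:
            pass  # Malformed header: fall through to the extension check.

    # Fallback: no usable Content-Length, so guess from the URL extension.
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return extension not in BINARY_EXTENSIONS
```

With this shape, letting plugins set their own maximum download size would just mean passing a different `max_size`.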
Instead of checking the content-type, it's checking the file extension. This will fail for example for PHP scripts that manage downloadable files - the extension will be .php but the contents returned may be a binary file, for example.
EDIT: forget that. The Spider should not check the content AT ALL - otherwise plugins that process other data types (like images for example) fail to work! The content type check is already done by the API itself, the Spider is not supposed to check that in the first place. There's already a function that converts the HTTP response into the corresponding Information type, and the Spider should only check if this Information is HTML or Text, but nothing else.
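A minimal sketch of what the EDIT proposes, assuming a simplified type hierarchy: the class names `Information`, `HTML`, `Text` and `Image`, and both function names, are hypothetical stand-ins and not the real GoLismero API. The conversion from HTTP response to Information type is driven by the Content-Type header, and the Spider only asks whether the result is HTML or Text.

```python
class Information:
    """Hypothetical base class for data extracted from an HTTP response."""

    def __init__(self, raw):
        self.raw = raw


class HTML(Information):
    pass


class Text(Information):
    pass


class Image(Information):
    pass


def response_to_information(content_type, body):
    """Stand-in for the existing conversion step: pick a type from Content-Type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return HTML(body)
    if mime.startswith("text/"):
        return Text(body)
    if mime.startswith("image/"):
        return Image(body)
    return Information(body)


def spider_should_parse(info):
    """The Spider only follows links in HTML or Text, and checks nothing else."""
    return isinstance(info, (HTML, Text))
```

Under these assumptions, a `download.php` URL that returns `application/octet-stream` becomes a plain `Information` object and the Spider skips it, while an image still reaches the plugins that process images, regardless of the file extension.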