Consistently check return value from fseek()/lseek(), such as by using a wrapper

ihsinme commented 3 years ago

I have tracked 13 places where you are not using this wrapper. https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/zip2john.c#L160 I think it is quite simple and safer than a direct call to fseek

magnumripper commented 3 years ago

I presume you mean we should move it to some shared location and use it all through JtR?

ihsinme commented 3 years ago

sorry i don't understand what JtR is. I just noticed that you have a wrapper function with fseek error checking, but there are also simple fseek calls. I think it will be better if all calls are processed correctly.

solardiz commented 3 years ago

We have many calls to fseek and lseek. Some of those (I think including all that come from core) check/use the return value. However, many don't.

Now that I grep for this, it looks like in case of lseek we currently always check/use its return value when we call it directly, except for its use in the jtr_fseek64 macro. In uses of that macro, we currently usually ignore the return value.

ihsinme commented 3 years ago

indeed, you have a lot of places with checks. but I paid attention to these 13 places.

https://github.com/openwall/john/blob/8585c3f605521478c629d4a87ba4b67c34038a3e/src/base64_convert.c#L1086 https://github.com/openwall/john/blob/8585c3f605521478c629d4a87ba4b67c34038a3e/src/base64_convert.c#L1088

https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/keepass2john.c#L260 https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/keepass2john.c#L340 https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/keepass2john.c#L342

https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/loader.c#L1914 https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/loader.c#L1917

https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/opencl_common.c#L2174 https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/opencl_common.c#L2176

https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/pp.c#L1519 https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/pp.c#L1522

https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/wpapcap2john.c#L228 https://github.com/openwall/john/blob/6dca7c39c56fbf7463de2ef6690ec557b2b51d6e/src/wpapcap2john.c#L230

ihsinme commented 3 years ago

friendly ping

solardiz commented 3 years ago

@ihsinme Thanks. What was your reason to report this to us, and to ping us about it now? I see no reason to single this issue out, nor to prioritize it. As you can see, we have 400+ open issues here, and overall code quality of "jumbo" isn't high - historically, "jumbo" was about easily (and almost as-is) accepting contributions that didn't make it into (what we later started calling) "core". If you want to contribute a PR fixing this issue, please go ahead. Otherwise I wouldn't expect it to be worked on soon, if at all.

solardiz commented 3 years ago

@ihsinme I took a look at other issues you reported to other projects at about the same time. Is this some kind of experiment with static code analysis or with contributing to projects or something else? Please let us know, and maybe we can also consider the bigger picture (of what you're trying to achieve and why and whether/how to do it best) rather than focus solely on this one issue. Thanks!

ihsinme commented 3 years ago

@solardiz. Thank you for your attention. I'm really looking for bugs that developers don't like to fix. and really I do it within the framework of static code analysis.

maybe I don't quite understand what you want to say. but fixing your code is based on your coding principles, not my behavior. if you still don't have enough information to make decisions, let me know.

Thanks.

solardiz commented 3 years ago

fixing your code is based on your coding principles, not my behavior.

I fully agree, and we'll proceed with the code accordingly. For some of our other projects with higher code quality, like passwdqc or even JtR core, fixing an issue like this would be high priority (but I think those don't have that issue, precisely due to their higher code quality). For JtR jumbo, it unfortunately currently isn't. But of course we should ideally fix it anyway.

However, just like you're (hopefully) trying to help our project by reporting this issue, I often try to help with others' volunteer work, and I'm interested in static code analysis and when/whether/how to apply it best. So if you're currently in a related research project or are just trying to help the various Open Source projects, I'm happy to share my perspective on these things. Which currently is: you seem to have identified this "missing use of existing wrapper function" pattern and implemented its detection (or do you use a third-party tool for that?), which is a very reasonable part of static code analysis in general, but effectiveness of reporting such issues to arbitrary projects will vary.

For most projects, they can see plenty of issues "like this" and have to wade through them manually by running a static code analyzer on their own. For example, for passwdqc we did run (an older version of) it through Coverity and make adjustments (yet Coverity missed the worst bug there was at the time - a dangling pointer dereference crossing a translation unit boundary). For JtR jumbo, this just wouldn't make sense yet - we know there are plenty of issues that static code analysis would detect. We can find many with bare eyes or grep as well. We're slowly improving jumbo's code quality here and there, but adding static code analysis beyond compiler warnings would only make sense to us once we're done with lower hanging fruit. Also, in terms of practical relevance, I think fuzzing through mangled JtR input files should be a higher priority. In fact, that's why we already have ability to build with ASan, and have added a primitive yet program specifics aware built-in fuzzer, which we should probably make more use of.

Of course, I could have spent my time fixing this one issue (easy to fix, harder to test for possible regressions - by the way, ideally we'd have *2john tools tested in CI, which we currently don't) instead of writing this comment - but I hope sharing my view on the bigger picture could help you focus your work in this area, and might also help other (prospective) contributors be more productive.

solardiz commented 3 years ago

This is a valid issue that should ideally be resolved, so now that we have it opened anyway I see no reason to just close it (unless we were to decide against ever fixing it and would label it WONTFIX, which we didn't). Reopening.

claudioandre-br commented 3 years ago

ideally we'd have *2john tools tested in CI, which we currently don't

We have something, but there are limitations:

computing power (we saw in a commit today that the new "whatever platform" CircleCI is running is slower than it used to be);
we can't wait CI forever; we need something fast (and "cheap");
format/2john authors need (should be a requirement) to create a "standard" script related to each 2john tool so that CI can discover and run it.

Off top of my head, we have good zip tests, including the 2john stuff, and it was automated.

The issue and all the above are John the Ripper 2.0 love hints?

solardiz commented 3 years ago

@claudioandre-br It wouldn't take much computing power to add testing of *2john tools against their known output (what they currently output for our samples). Compared to the kind of testing we already do in CI, this would almost not be measurable. (If we were to test actual cracking of the resulting "hashes", that would be more significant, but this is unnecessary to test merely the *2john tools themselves for regressions.)

openwall / john

Consistently check return value from fseek()/lseek(), such as by using a wrapper #4810