crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.41k stars 1.62k forks source link

Warn on `if method_that_only_returns_nil` #12365

Open jgaskins opened 2 years ago

jgaskins commented 2 years ago

Feature Request

Given this code:

def method_that_only_returns_nil : Nil
  # literally anything
end

def success!
end

if method_that_only_returns_nil
  success!
end

Checking for truthiness of the return value of method_that_only_returns_nil will never enter the block, and that might be a useful compiler warning.

An example of this I just ran into is that I usually use this type annotation when I'm using a method for its side effects, but I decided to add a return value later and forgot to change/remove the type annotation. A warning here would've saved me significant debugging time. πŸ˜„

A stretch goal here might be to warn any time you explicitly return a non-nil value with a Nil return type.

Blacksmoke16 commented 2 years ago

Feel like this would be something better suited to https://github.com/crystal-ameba/ameba?

HertzDevil commented 2 years ago

IIRC they rejected it because Ameba is strictly limited to syntactic analysis and this requires type inference.

Blacksmoke16 commented 2 years ago

Hmm, couldn't it be inferred just via the syntax by seeing a method has an explicit falsely return type while being used in conditional logic? Otherwise, yea that's fair.

jgaskins commented 2 years ago

Feel like this would be something better suited to https://github.com/crystal-ameba/ameba?

What I see most in linters is "things that are probably harmless but may be difficult to grok", like variable shadowing or useless assignment. If you have this particular construct in your code, though, it seems almost certain to be a bug. I can't imagine a case where it wouldn't be, at least.

And in cases like this, should we need to opt into a third-party tool to tell us we're doing something obviously incorrect, or should we be able to rely on the compiler to tell us? Honest question, I don't have the Best Answerβ„’, though I do have a preference for not having to opt in.

Blacksmoke16 commented 2 years ago

If you have this particular construct in your code, though, it seems almost certain to be a bug

Yea I guess my thinking is this would be the first case of the compiler warning the user about something they've done versus a more language level thing like a deprecation. At least until https://github.com/crystal-lang/crystal/issues/9246, which might warrant some higher level discussion, as you point out, on how (or if) that should exactly work. E.g. should they be any different than the internal compiler warnings themselves. Or what qualifies for the compiler to warn you about as there are lot of things and bugs you can have in your code. I can't imagine it being maintainable to identify and handle all of them.

A related thought is this is exactly why you write tests, as it's able to point out obvious bugs like this without requiring any extra compiler logic.

jgaskins commented 2 years ago

I guess my thinking is this would be the first case of the compiler warning the user about something they've done versus a more language level thing like a deprecation.

This is a good point, especially because I'm suggesting it as a warning. There is precedent for the compiler protecting you from things like this in Crystal, though. One example is when you define finalize on a struct type, it's obvious to the compiler that you're doing something you'll regret later, and it tells you that. That specific case halts compilation, though, so if we follow that precedent, maybe this should be an error rather than a warning? I suppose that might make more sense in this case, too, since you're clearly trying to do something that will never work the way you expect.

A related thought is this is exactly why you write tests, as it's able to point out obvious bugs like this without requiring any extra compiler logic.

This is actually how I noticed the bug. πŸ˜„ A unit test was telling me the return value was nil when it shouldn't have been. nil is a legit return value for this method under certain conditions but no matter what I was returning, even with an explicit return, it still returned nil. There is significant indirection going on in this particular section of the code, though (it's a wrapper around an interface), so while it was clear to me that there was definitely a bug somewhere, it wasn't clear where it was.

asterite commented 2 years ago

Instead of a warning, we could make it an error.

That said, in general in Crystal it's impossible to make these errors or warnings without preventing valid code from compiling.

I actually tried this change just now, and the compiler can't compile itself anymore. The std specs can't compile anymore. And they all have valid code!

Here's a contrived example:

class Foo
  def companion : Nil
    nil
  end

  def do_it
    puts "Foo is doing it!"
  end
end

class Bar
  def companion
    Foo.new
  end

  def do_it
    puts "Bar is doing it!"
  end
end

def logic(entity)
  if entity.companion
    puts "Some special logic if the entity has a companion"
  end

  entity.do_it
end

logic(Bar.new)
logic(Foo.new)

The last line, logic(Foo.new), won't compile. entity.companion for Foo is always nil! But that code is totally valid for Bar.

The main issue is that a single method can be instantiated differently for different types.

I guess to make it work we'd have to check if for all possible instantiations it's always nil. Say it's always nil for Foo and Bar, then you produce an error. But that would prevent the code from being there if some day someone (maybe not you, but a user of your library) adds a type Baz.

yxhuvud commented 2 years ago

I think this would be a bad idea. Consider code like

class B
  def foo
    false
  end
end

class C
  def foo
    rand < 0.5
  end
end

def x(val)
  if val.foo
    puts :hi
  end
end

p x(B.new)
p x(C.new)

With the suggestion the first call to x would warn.

Or in other words: This break duck typing for methods returning falsey values.

Error on explicit return against a declared nil type do sound very reasonable though.

EDIT: Uh, perhaps I should learn to read. This is basically what asterite write above :sweat:

straight-shoota commented 2 years ago

I don't think this is something for the compiler to care about. Neither if nil nor if false are errors. There are multiple ways that could leave to false positives, which is neither good for an error nor a warning. This seems like the kind of thing a linter could help with (that doesn't necessarily mean ameba should support this).

Error on explicit return against a declared nil type do sound very reasonable though.

I agree, that should be a good reason for the compiler to refuse code.

jgaskins commented 2 years ago

Are we talking about the same thing? None of the code examples provided here have an explicit Nil return type, as in the original issue description. To be clear, I'm not talking about type inference β€” I can see that definitely having a problem with false positives. I just mean method signatures that have : Nil at the end. Or does that matter?

asterite commented 2 years ago

@jgaskins I just updated my example with a Nil return type.

jgaskins commented 2 years ago

@asterite Hmm, I just ran a variant of your code example (I added pp typeof(entity) in logic) and it looks like there's a distinction I'm missing. Someone told me a few years ago that a method is compiled only once for all types passed in. I took that to mean that, in your code example, typeof(entity) inside logic would be (Foo | Bar), in which case the return type Union(Foo | Bar)#companion would be Foo | Nil. It's the only explanation I could imagine with that would explain how a method is only compiled once for all types.

However, in both invocations, the compile-time type is the single concrete type that's passed into the logic method with that type signature. Is there a hidden layer of indirection that the compiler adds to give us that single concrete type inside the method or is the method actually being compiled multiple times?

At the very least, this helps explain why this is so much less straightforward than I imagined. πŸ˜•

asterite commented 2 years ago

Methods are compiled differently each time the input types are different.

HertzDevil commented 2 years ago

A method with a Nil return value restriction may still explicitly return nil itself or any other expression whose type is Nil or NoReturn. A simple syntax check suggests that the standard library violates this in two places. The first one is:

https://github.com/crystal-lang/crystal/blob/5de237e052004bca0e5a43e903139558f5368a2a/src/io/buffered.cr#L128-L137

The abstract method IO::Buffered#unbuffered_write has no return value restriction. Only some of the implementations put Nil or NoReturn, and at least one implementation returns an Int32. (This seems to be a consequence of #9469.) The second one is:

https://github.com/crystal-lang/crystal/blob/5de237e052004bca0e5a43e903139558f5368a2a/src/http/server/handlers/websocket_handler.cr#L21-L24

A similar call can be found in HTTP::StaticFileHandler. This calls the following:

https://github.com/crystal-lang/crystal/blob/5de237e052004bca0e5a43e903139558f5368a2a/src/http/server/handler.cr#L26-L32

respond_with_status returns nil, but here next_handler is a HTTP::Handler | Proc(HTTP::Server::Context, Nil). HTTP::Handler#call, like the abstract method above, also has no return value restriction, and this implementation doesn't return nil.

Of course, a purely syntactic linter won't be able to tell the return type of call_next, or even look up the declaration of #call_next in the first place, as otherwise what it does would amount to semantic analysis already. For such a linter it is probably acceptable to reject any return expression that isn't the nil literal.

Also weren't those explicit returns recently discussed somewhere else?

stakach commented 1 year ago

I had a bug caused by something similar to this

state = {key: "value"}
if state == "value"
  # do some stuff
end

Simple mistake that took hours of debugging to work out. LLVM would probably optimise the if statement out completely too - not sure if that is something that could be bubbled up as a warning

straight-shoota commented 1 year ago

@stakach That's #10277

docelic commented 1 year ago

But supposedly the type of problem @stakach is mentioning has been implemented in Ameba (based on the discussion that started in #10277): https://github.com/crystal-ameba/ameba/issues/237

straight-shoota commented 1 year ago

Yes, https://github.com/crystal-ameba/ameba/issues/237 technically goes at that problem, but only in a very reduced, almost negligible scope: It only considers equality operations between two literals. As stated before, ameba does not do semantic code analysis. So in its current state it's simply not able to do anything more elaborate that would require knowing the type of a variable or method call.