alexandru / scala-best-practices

A collection of Scala best practices
4.39k stars 623 forks source link

Add rule 4.12 #13

Closed leothekim closed 9 years ago

leothekim commented 10 years ago

Saw this on Hacker News. This is a great collection of rules! I added a rule about the behavior of lazy vals declared in functions, which has bit us in the butt a while back. Though declaring lazy vals in functions is allowed by scala, the behavior we assumed was that the lazy val would use a locally scoped lock. Instead, it uses an object-instance lock. Not only is using a lazy val in a function a little smelly, but it can result in an unexpected concurrency bottleneck.

alexandru commented 10 years ago

Thanks @leothekim but I'm not familiar with this behavior and will have to doublecheck. Will be back.

leothekim commented 10 years ago

Cool. Maybe this trivial example can help?

class LazyTest {
  def foo = {
    lazy val bar: Int = 42
    bar
  }
}

javap on the compiled code produces this:

Compiled from "Lazy.scala"
public class LazyTest {
  public int foo();
    descriptor: ()I
    Code:
       0: invokestatic  #16                 // Method scala/runtime/IntRef.zero:()Lscala/runtime/IntRef;
       3: astore_1
       4: iconst_0
       5: invokestatic  #22                 // Method scala/runtime/VolatileByteRef.create:(B)Lscala/runtime/VolatileByteRef;
       8: astore_2
       9: aload_0
      10: aload_1
      11: aload_2
      12: invokespecial #26                 // Method bar$1:(Lscala/runtime/IntRef;Lscala/runtime/VolatileByteRef;)I
      15: ireturn
    LineNumberTable:
      line 3: 3
      line 2: 4
      line 4: 9
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0      16     0  this   LLazyTest;
          4      11     1 bar$lzy   Lscala/runtime/IntRef;
          9       6     2 bitmap$0   Lscala/runtime/VolatileByteRef;

  private final int bar$lzycompute$1(scala.runtime.IntRef, scala.runtime.VolatileByteRef);
    descriptor: (Lscala/runtime/IntRef;Lscala/runtime/VolatileByteRef;)I
    Code:
       0: aload_0
       1: dup
       2: astore_3
       3: monitorenter
       4: aload_2
       5: getfield      #37                 // Field scala/runtime/VolatileByteRef.elem:B
       8: iconst_1
       9: iand
      10: i2b
      11: iconst_0
      12: if_icmpne     32
      15: aload_1
      16: bipush        42
      18: putfield      #40                 // Field scala/runtime/IntRef.elem:I
      21: aload_2
      22: aload_2
      23: getfield      #37                 // Field scala/runtime/VolatileByteRef.elem:B
      26: iconst_1
      27: ior
      28: i2b
      29: putfield      #37                 // Field scala/runtime/VolatileByteRef.elem:B
      32: getstatic     #46                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      35: pop
      36: aload_3
      37: monitorexit
      38: aload_1
      39: getfield      #40                 // Field scala/runtime/IntRef.elem:I
      42: ireturn
      43: aload_3
      44: monitorexit
      45: athrow
    Exception table:
       from    to  target type
           4    38    43   any
               LineNumberTable:
      line 3: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0      46     0  this   LLazyTest;
          0      46     1 bar$lzy$1   Lscala/runtime/IntRef;
          0      46     2 bitmap$0$1   Lscala/runtime/VolatileByteRef;

  private final int bar$1(scala.runtime.IntRef, scala.runtime.VolatileByteRef);
    descriptor: (Lscala/runtime/IntRef;Lscala/runtime/VolatileByteRef;)I
    Code:
       0: aload_2
       1: getfield      #37                 // Field scala/runtime/VolatileByteRef.elem:B
       4: iconst_1
       5: iand
       6: i2b
       7: iconst_0
       8: if_icmpne     20
      11: aload_0
      12: aload_1
      13: aload_2
      14: invokespecial #52                 // Method bar$lzycompute$1:(Lscala/runtime/IntRef;Lscala/runtime/VolatileByteRef;)I
      17: goto          24
      20: aload_1
      21: getfield      #40                 // Field scala/runtime/IntRef.elem:I
      24: ireturn
    LineNumberTable:
      line 3: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0      25     0  this   LLazyTest;
          0      25     1 bar$lzy$1   Lscala/runtime/IntRef;
          0      25     2 bitmap$0$1   Lscala/runtime/VolatileByteRef;

  public LazyTest();
    descriptor: ()V
    Code:
       0: aload_0
       1: invokespecial #56                 // Method java/lang/Object."<init>":()V
       4: return
    LineNumberTable:
      line 7: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       5     0  this   LLazyTest;
}

Note how foo() invokes bar$1 which in turn invoked bar$lzycompute$1. The latter is an object-level private method calling monitorenter and monitorexit.

alexandru commented 10 years ago

So monitorenter is applied on the stack param loaded with aload_0, which should be this. Is that correct? That doesn't make sense, why not put the lock on VolatileByteRef (which I presume is the Boolean that indicates whether that Int has been initialized)?

Nice catch. Lazy vals never made sense to me within functions. Will merge, though I think a more detailed explanation is needed.

ejstrobel commented 9 years ago

What if the lazy val depends on function arguments, and is used multiple times in some flows of control, but not at all in another? And if there are no concurrency issues, because no other thread is touching this? I find that sometimes lazy vals can make code more readable than nested defs and repeated assignments.

I am not sure this rule is generic enough for this list. It depends on a current implementation quirk of in function lazy vals, and on your particular use case of concurrent access. The concurrency issue should be filed as bug report rather.

For a small and contrived example compare:

def foo(x: Int, y: Int): Int = {
  lazy val z = expensive(y)
  if (x > 0 && z > 0) z else x
}

def foo(x: Int, y: Int): Int = {
  if (x > 0) {
    val z = expensive(y)
    if (z > 0) z else x
  } else x
}
leothekim commented 9 years ago

I don't know what you mean by "generic enough" as there are some very specific rules, e.g. 2.1. MUST NOT use "return", in this list. The concurrency issue is definitely a gotcha, but I'd argue that if you declared a lazy val in a function that doesn't get used in certain flows, the function can be rewritten to be clearer. We replaced lazy vals declared in functions with nested defs, and this did not make the code any less readable. In some cases, this made the code more readable.

On Wed, Dec 3, 2014 at 3:04 PM, Jürgen Strobel notifications@github.com wrote:

What if the lazy val depends on function arguments, and is used multiple times in some flows of control, but not at all in another? And if there are no concurrency issues, because no other thread is touching this? I find that sometimes lazy vals can make code more readable than nested defs and repeated assignments.

I am not sure this rule is generic enough for this list. It depends on a current implementation quirk of in function lazy vals, and on your particular use case of concurrent access. The concurrency issue should be filed as bug report rather.

— Reply to this email directly or view it on GitHub https://github.com/alexandru/scala-best-practices/pull/13#issuecomment-65479354 .

-- My PGP public key can be found by exploring the following link:

-- http://preview.tinyurl.com/34ztb5

ejstrobel commented 9 years ago

I meant that the rule depends on some specific circumstances, and it may or may not improve the readability, while MUST NOT use return is true all the time. Are you sure that your solution of using nested defs is good for everyone? How would you redo my example above with a nested def?

leothekim commented 9 years ago

Sorry, I didn't see your example - it didn't come through in email! I don't think the second form is any less readable, but I see what you're getting at. I can change the rule to be "SHOULD NOT", given that lazy val behavior in functions isn't currently specified (though it currently behaves in a nasty way). My personal opinion is that they should still be avoided in functions as we've been successfully rewritten code that had used them in the past and maintained or even improved readability.