amnaredo / test

0 stars 0 forks source link

Potential file handle leak #264

Open amnaredo opened 3 years ago

amnaredo commented 3 years ago

We found this downstream in mill and track it as https://github.com/lihaoyi/mill/issues/754 .

@ajrnz was able to track it down to upickle, somewhere between 0.9.0 and 0.9.5. I do just quote his comment (https://github.com/lihaoyi/mill/issues/754#issuecomment-586977490) here:

I've just had a quick look at this. It appears to be a problem somewhere under upickle.default.read.

Here is a minimized test case:

object MillTest extends App {
  //val baseDir = os.pwd
  val baseDir = os.home / "dev" / "ext" / "mill"
  val x =
    os.walk(baseDir)
      .filter(_.last == "meta.json")
      .map{ p =>
      println(s"Loading: $p")
      upickle.default.read[ujson.Value](p.toIO)
    }

  println("waiting")
  Thread.sleep(1000*1000)
}

Using:

    ivy"com.lihaoyi::upickle:0.9.5",
    ivy"com.lihaoyi::os-lib:0.6.2"

If you find the java PID and then do a lsof -p <pid> you see all the files are open.

Appears to have been introduced between upickle 0.9.0 and 0.9.5.

Don't have more time to dig at the moment but hope this helps...

ID: 303 Original Author: lefou

amnaredo commented 3 years ago

Poking around the code, this seems to come down to https://github.com/lihaoyi/upickle/blob/master/ujson/src/ujson/Readable.scala#L22 which uses https://github.com/lihaoyi/upickle/blob/master/ujson/src/ujson/InputStreamParser.scala which has a no-op close method. Is there a reason it doesn't call data.close() instead?

I notice that the Transformer instance calls parse which calls close at https://github.com/lihaoyi/upickle/blob/master/ujson/src/ujson/Parser.scala#L114.

Perhaps some InputStreamParsers need to not close the InputStream and some need to close it? Original Author: nafg

amnaredo commented 3 years ago

@nafg you are right. I think the logic to close the input stream should live in https://github.com/lihaoyi/upickle/blob/master/ujson/src/ujson/Readable.scala, since it's conceivable that an InputStreamParser may be used on a longer inputstream, whereas a Readable is defined to be a one-shot affair that cleans up after itself. Original Author: lihaoyi

amnaredo commented 3 years ago

To me that argues more that Parser#parse shouldn't [necessarily] call close so that InputStreamParser shouldn't be forced to have a no-op close method. I just noticed that data is a val so it does remain possible to call data.close() from outside class InputStreamParser. However I don't see where in Readable.scala it could go, since it just uses object InputStreamReader.

I did a github search for InputStreamParser, and it seems to only be used in Readable.scala (and tests). I think the one place where close should probably not be called is when adapting a geny.Readable: https://github.com/lihaoyi/upickle/blob/d90588faf1eb4c503bc23c3487d015d24870663d/ujson/src/ujson/Readable.scala#L32

Original Author: nafg

amnaredo commented 3 years ago

it could go in fromTransformer#transform I think? https://github.com/lihaoyi/upickle/blob/master/ujson/src/ujson/Readable.scala#L12 Original Author: lihaoyi

amnaredo commented 3 years ago

Should this be closed now Original Author: nafg

amnaredo commented 3 years ago

i think so Original Author: lihaoyi