mckennalab / FlashFry

FlashFry: The rapid CRISPR target site characterization tool

Too many open files #2

Closed moritzschaefer closed 6 years ago

moritzschaefer commented 6 years ago

When running the example in your README.md I get the following exception:


Exception in thread "main" java.io.IOException: Too many open files
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.createTempFile(File.java:2024)
        at crispr.BinWriter.$anonfun$new$1(BinWriter.scala:47)
        at crispr.BinWriter.$anonfun$new$1$adapted(BinWriter.scala:46)
        at scala.collection.Iterator.foreach(Iterator.scala:929)
        at scala.collection.Iterator.foreach$(Iterator.scala:929)
        at utils.BaseCombinationIterator.foreach(BaseCombinationGenerator.scala:58)
        at crispr.BinWriter.<init>(BinWriter.scala:46)
        at modules.BuildOffTargetDatabase.$anonfun$runWithOptions$1(BuildOffTargetDatabase.scala:67)
        at modules.BuildOffTargetDatabase.$anonfun$runWithOptions$1$adapted(BuildOffTargetDatabase.scala:61)
        at scala.Option.map(Option.scala:146)
        at modules.BuildOffTargetDatabase.runWithOptions(BuildOffTargetDatabase.scala:61)
        at main.scala.Main$.$anonfun$new$2(Main.scala:81)
        at main.scala.Main$.$anonfun$new$2$adapted(Main.scala:74)
        at scala.Option.map(Option.scala:146)
        at main.scala.Main$.delayedEndpoint$main$scala$Main$1(Main.scala:74)
        at main.scala.Main$delayedInit$body.apply(Main.scala:57)
        at scala.Function0.apply$mcV$sp(Function0.scala:34)
        at scala.Function0.apply$mcV$sp$(Function0.scala:34)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App.$anonfun$main$1$adapted(App.scala:76)
        at scala.collection.immutable.List.foreach(List.scala:378)
        at scala.App.main(App.scala:76)
        at scala.App.main$(App.scala:74)
        at main.scala.Main$.main(Main.scala:57)
        at main.scala.Main.main(Main.scala)

Increasing my open-files limit from 4096 to 80000 solved the issue, though not everyone may be able to increase the limit.

aaronmck commented 6 years ago

Thanks for the feedback; I imagine other people will run into this as well. To keep memory usage low, we stream individual prefix bins to disk as we build the target database, which means every bin has an open file handle at that point. It should be pretty straightforward to buffer writes for each bin instead, which would eliminate this problem. Thanks for the report, I'll leave this open until it's fixed.
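For anyone curious what that buffering change could look like, here is a minimal sketch (not FlashFry's actual `BinWriter` code; the class and parameter names are hypothetical). Instead of holding one open handle per bin, it accumulates lines per bin in memory and appends them to the bin's file in short-lived open/write/close cycles, so only one handle is open at a time:

```scala
import java.io.{BufferedWriter, File, FileWriter}
import scala.collection.mutable

// Hypothetical sketch of per-bin buffered writing.
// Lines are buffered in memory per bin; once a buffer reaches
// flushThreshold, the bin's file is opened in append mode, written,
// and closed immediately -- no long-lived handle per bin.
class BufferedBinWriter(outputDir: File, flushThreshold: Int = 1000) {
  private val buffers = mutable.Map[String, mutable.ArrayBuffer[String]]()

  def write(bin: String, line: String): Unit = {
    val buf = buffers.getOrElseUpdate(bin, mutable.ArrayBuffer[String]())
    buf += line
    if (buf.size >= flushThreshold) flush(bin)
  }

  private def flush(bin: String): Unit = {
    val buf = buffers(bin)
    if (buf.nonEmpty) {
      // second FileWriter argument `true` = append mode
      val w = new BufferedWriter(new FileWriter(new File(outputDir, bin + ".txt"), true))
      try buf.foreach { l => w.write(l); w.newLine() }
      finally w.close()
      buf.clear()
    }
  }

  // flush any remaining buffered lines on shutdown
  def close(): Unit = buffers.keys.foreach(flush)
}
```

The trade-off is extra open/close syscalls per flush, which `flushThreshold` amortizes; the number of simultaneously open files no longer scales with the number of bins.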

aaronmck commented 6 years ago

This should be fixed in release 1.7