typelevel / skunk

A data access library for Scala + Postgres.
https://typelevel.org/skunk/
MIT License

Pool$ResourceLeak in ScalaCheck Weaver suite #572

Open nkgm opened 3 years ago

nkgm commented 3 years ago
import cats.effect._
import skunk._
import skunk.implicits._
import natchez.Trace.Implicits.noop
import skunk.codec.text.text
import weaver.IOSuite
import weaver.scalacheck.{CheckConfig, Checkers}
import org.scalacheck.Gen

object SkunkWeaverSuite extends IOSuite with Checkers {

  type Res = Resource[IO, Session[IO]]

  // high count with low pool size (100, 1) triggers the issue consistently but will still trigger for 
  // low count / high pool size (1, 10), though not every time.
  override def checkConfig: CheckConfig = CheckConfig.default.copy(minimumSuccessful = 100)

  override def sharedResource: Resource[IO, Res] = {
    Session
      .pooled[IO](
        host = "localhost",
        port = 5432,
        user = "dev",
        database = "dev",
        max = 1,
      )
  }

  test("skunk") { pool =>
    forall(Gen.alphaNumStr) { _ =>
      pool.use { pg =>
        pg.execute(sql"select version();".query(text)).map { v =>
          // println(s"Connected to Postgres $v")
          expect(!v.isEmpty) // irrelevant
        }
      }
    }
  }

}
[error] - Unexpected failure 0ms
[error]   Pool$ResourceLeak: 
[error]   πŸ”₯  
[error]   πŸ”₯  ResourceLeak
[error]   πŸ”₯  
[error]   πŸ”₯    Problem: A resource leak was detected during pool finalization.
[error]   πŸ”₯     Detail: Expected 1 active slot(s) and no deferrals, found 0 slots and 2
[error]   πŸ”₯             deferral(s).
[error]   πŸ”₯       Hint: The most common causes of resource leaks are (a) using a pool on a
[error]   πŸ”₯             fiber that was neither joined or canceled prior to pool finalization,
[error]   πŸ”₯             and (b) using `Resource.allocated` and failing to finalize allocated
[error]   πŸ”₯             resources prior to pool finalization.
[error]   πŸ”₯  
[error] 
[error]   skunk.util.Pool$ResourceLeak: A resource leak was detected during pool finalization.
[error] 
[error]   Pool.scala:156            skunk.util.Pool$#$anonfun$of$12
[error]   RunnerCompat.scala:225    void @ weaver.framework.RunnerCompat$ConcurrentQueueEventBroker#send
[error]   Pull.scala:1180           >>$extension @ fs2.Pull$#fs2$Pull$$go$1
[error]   Stream.scala:2047         complete @ fs2.Stream#$anonfun$parEvalMap$2
[error]   Stream.scala:2041         get @ fs2.Stream#$anonfun$parEvalMap$2
[error]   RunnerCompat.scala:158    parTraverse$extension @ weaver.framework.RunnerCompat#$anonfun$run$9
[error]   Stream.scala:2047         complete @ fs2.Stream#$anonfun$parEvalMap$2
[error]   Stream.scala:2041         get @ fs2.Stream#$anonfun$parEvalMap$2
[error]   RunnerCompat.scala:158    parTraverse$extension @ weaver.framework.RunnerCompat#$anonfun$run$9
[error]   RunnerCompat.scala:158    parTraverse$extension @ weaver.framework.RunnerCompat#$anonfun$run$9
[error]   RunnerCompat.scala:150    productR @ weaver.framework.RunnerCompat#$anonfun$run$9
[error]   RunnerCompat.scala:150    productR @ weaver.framework.RunnerCompat#$anonfun$run$9
[error]   Pool.scala:148            get @ skunk.util.Pool$#free$1

This happens when Skunk is used together with Weaver and ScalaCheck. I'm not sure which library is at fault, but hopefully this issue will serve as a good starting point.

Starting an sbt console and running test or ~test produces this error on every single run, and querying select count(*) from pg_stat_activity in psql shows the connection count growing by the pool size after each failed test run. Quitting and restarting sbt gets rid of the dangling connections.
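
For what it's worth, the error's Hint names two usual causes. A minimal sketch of cause (a), using hypothetical names (leaky, pooled) rather than anything from the suite above, would be a fiber that checks out a session and is neither joined nor canceled before the pool itself is finalized:

import cats.effect._
import skunk._

// Hypothetical illustration of hint (a):
// `pooled` is the outer pool resource as returned by Session.pooled.
def leaky(pooled: Resource[IO, Resource[IO, Session[IO]]]): IO[Unit] =
  pooled.use { pool =>
    pool.use(_ => IO.never) // the checkout never completes
      .start                // the fiber is neither joined nor canceled...
      .void                 // ...so the session outlives the pool's finalizer
  }

Cause (b) would be calling Resource.allocated on the pool or a session and never running the returned finalizer.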

keynmol commented 2 years ago

Glad I remembered this issue :) I can reproduce it without ScalaCheck, in the following build: https://github.com/keynmol/jobby/actions/runs/3209965828/jobs/5247080202

What's fascinating is that the same build has other tests (integrationTests) that use exactly the same resource, which includes Skunk, and they've never failed.

I attempted a more self-contained version of the OP's snippet, using Scala CLI (run it with scala-cli test file.scala):

//> using lib "org.tpolecat::skunk-core:0.3.2"
//> using lib "com.disneystreaming::weaver-cats:0.8.0"
//> using testFramework "weaver.framework.CatsEffect"
//> using lib "com.disneystreaming::weaver-scalacheck:0.8.0"
//> using scala "2"

import cats.effect._
import skunk._
import skunk.implicits._
import natchez.Trace.Implicits.noop
import skunk.codec.text.text
import weaver._
import weaver.scalacheck._
import org.scalacheck.Gen

object SkunkWeaverSuite extends IOSuite with Checkers {

  type Res = Resource[IO, Session[IO]]

  // high count with low pool size (100, 1) triggers the issue consistently but will still trigger for
  // low count / high pool size (1, 10), though not every time.
  override def checkConfig: CheckConfig =
    CheckConfig.default.copy(minimumSuccessful = 100)

  override def sharedResource: Resource[IO, Res] = {
    Session
      .pooled[IO](
        host = "localhost",
        port = 5432,
        user = "postgres",
        database = "postgres",
        password = Some("mysecretpassword"),
        max = 50
      )
  }

  test("skunk") { pool =>
    forall(Gen.alphaNumStr) { _ =>
      pool.use {
        pg =>
          pg.execute(sql"select version();".query(text)).map { v =>
            expect(!v.isEmpty) // irrelevant
          }
      }

    }
  }

}

But no matter how many times I run it, I can't reproduce the issue.

So this leaves Weaver and Skunk. I will try running Skunk with debug logging to see if there are any hints.
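
(For reference, I believe this just means flipping Session.pooled's debug flag in the snippet above, roughly as below; the exact parameter name is an assumption on my part.)

  override def sharedResource: Resource[IO, Res] =
    Session
      .pooled[IO](
        host = "localhost",
        port = 5432,
        user = "postgres",
        database = "postgres",
        password = Some("mysecretpassword"),
        max = 50,
        debug = true // assumed flag: echo frontend/backend protocol messages
      )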

keynmol commented 2 years ago

Here's the full sequence of debug logging:

2022.10.08 11:06:58:866 io-compute-9 INFO 🐳 [testcontainers/ryuk:0.3.3]
    Creating container for image: testcontainers/ryuk:0.3.3
    Container testcontainers/ryuk:0.3.3 is starting: 74e76644f679a4ede92aff2a563bbdfd28dcbf9add0ead2ea393999114d92130
    Container testcontainers/ryuk:0.3.3 started in PT0.536578S
[info] + landing page 4s
[info] + landing page 2 4s
 β†’ StartupMessage(test,test,Map(client_min_messages -> WARNING, DateStyle -> ISO, MDY, IntervalStyle -> iso_8601, client_encoding -> UTF8))
 β†’ StartupMessage(test,test,Map(client_min_messages -> WARNING, DateStyle -> ISO, MDY, IntervalStyle -> iso_8601, client_encoding -> UTF8))
 ← AuthenticationSASL(SCRAM-SHA-256)
 ← AuthenticationSASL(SCRAM-SHA-256)
 β†’ SASLInitialResponse(SCRAM-SHA-256, ByteVector(33 bytes, 0x6e2c2c6e3d2a2c723d45547d6c5323396f24737c6d6b51392f2f34732f254a6f51))
 β†’ SASLInitialResponse(SCRAM-SHA-256, ByteVector(33 bytes, 0x6e2c2c6e3d2a2c723d3e5830487a6857327634656429535e796a4e2a644679764e))
 ← AuthenticationSASLContinue(ByteVector(84 bytes, 0x723d3e5830487a6857327634656429535e796a4e2a644679764e51417557447535782b327a6370693272617a61324d6441392c
733d7744772b30574f77376c626e317459514438304b67773d3d2c693d34303936))
 ← AuthenticationSASLContinue(ByteVector(84 bytes, 0x723d45547d6c5323396f24737c6d6b51392f2f34732f254a6f514474724c4a6e49434a4567417678366964703565747556372c
733d7744772b30574f77376c626e317459514438304b67773d3d2c693d34303936))
 β†’ SASLResponse(ByteVector(104 bytes, 0x633d626977732c723d45547d6c5323396f24737c6d6b51392f2f34732f254a6f514474724c4a6e49434a4567417678366964703565747556372
c703d6a62475447614f6c484d71727364724e7747777476484967576d64384e53436d6b43524c54666c393557773d))
 ← AuthenticationSASLFinal(ByteVector(46 bytes, 0x763d4c47587350586a656a4d4e654e4649434970654c3679585156556f59416b34496f4a5139586e6f48504a383d))
 β†’ SASLResponse(ByteVector(104 bytes, 0x633d626977732c723d3e5830487a6857327634656429535e796a4e2a644679764e51417557447535782b327a6370693272617a61324d6441392
c703d386a68394763766c75486e683672557131334849693067797269482f566f69786c4c51746630676b306a673d))
 ← AuthenticationOk
 ← ParameterStatus(application_name,)
 ← ParameterStatus(client_encoding,UTF8)
 ← ParameterStatus(DateStyle,ISO, MDY)
 ← 
πŸ”₯  
πŸ”₯  EofException
πŸ”₯  
πŸ”₯    Problem: EOF was reached on the network socket.
πŸ”₯     Detail: Attempt to read 5 byte(s) failed after 0 bytes(s) were read, because
πŸ”₯             the connection had closed.
πŸ”₯       Hint: Discard this session and retry with a new one.
πŸ”₯  

skunk.exception.EofException: EOF was reached on the network socket.
 ← ParameterStatus(default_transaction_read_only,off)
 ← ParameterStatus(in_hot_standby,off)
 ← ParameterStatus(integer_datetimes,on)
 ← ParameterStatus(IntervalStyle,iso_8601)
 ← ParameterStatus(is_superuser,on)
 ← ParameterStatus(server_encoding,UTF8)
 ← ParameterStatus(server_version,14.2 (Debian 14.2-1.pgdg110+1))
 ← ParameterStatus(session_authorization,test)
 ← ParameterStatus(standard_conforming_strings,on)
 ← ParameterStatus(TimeZone,Etc/UTC)
 ← BackendKeyData(73,299016235)
 ← ReadyForQuery(Idle)
 ← ErrorResponse(F -> be-secure.c, M -> terminating connection due to unexpected postmaster exit, V -> FATAL, L -> 207, C -> 57P01, R -> secure_read, S -> 
FATAL
 ← 
πŸ”₯  
πŸ”₯  EofException
πŸ”₯  
πŸ”₯    Problem: EOF was reached on the network socket.
πŸ”₯     Detail: Attempt to read 5 byte(s) failed after 0 bytes(s) were read, because
πŸ”₯             the connection had closed.
πŸ”₯       Hint: Discard this session and retry with a new one.
πŸ”₯  

skunk.exception.EofException: EOF was reached on the network socket.
 β†’ Query(
          SELECT oid typid, typname, typarray, typrelid
          FROM   pg_type
          WHERE typnamespace IN (
            SELECT oid
            FROM   pg_namespace
            WHERE nspname = ANY(current_schemas(true))
          )
        )
[info] - Unexpected failure 0ms
[info] *************FAILURES**************
[info] jobby.tests.frontend.UsersSpec
[error] - Unexpected failure 0ms
[error]   Pool$ResourceLeak: 
[error]   πŸ”₯  
[error]   πŸ”₯  ResourceLeak
[error]   πŸ”₯  
[error]   πŸ”₯    Problem: A resource leak was detected during pool finalization.
[error]   πŸ”₯     Detail: Expected 10 active slot(s) and no deferrals, found 8 slots and 0
[error]   πŸ”₯             deferral(s).
[error]   πŸ”₯       Hint: The most common causes of resource leaks are (a) using a pool on a
[error]   πŸ”₯             fiber that was neither joined or canceled prior to pool finalization,
[error]   πŸ”₯             and (b) using `Resource.allocated` and failing to finalize allocated
[error]   πŸ”₯             resources prior to pool finalization.
[error]   πŸ”₯  
[error] 
[error]   skunk.util.Pool$ResourceLeak: A resource leak was detected during pool finalization.
[error] 
[error]   Pool.scala:19                skunk.util.Pool$ResourceLeak$#apply
[error]   Pool.scala:156               skunk.util.Pool$#free$1$$anonfun$1$$anonfun$2
[error]   Applicative.scala:264        cats.Applicative#whenA
[error]   Applicative.scala:39         cats.Applicative#whenA$
[error]   RunnerCompat.scala:257       void @ weaver.framework.RunnerCompat$ConcurrentQueueEventBroker#send
[error]   RunnerCompat.scala:181       productR @ weaver.framework.RunnerCompat#run$$anonfun$1$$anonfun$2$$anonfun$1$$anonfun$1
[error]   RunnerCompat.scala:181       productR @ weaver.framework.RunnerCompat#run$$anonfun$1$$anonfun$2$$anonfun$1$$anonfun$1
[error]   GlobalResourceF.scala:143    get @ weaver.GlobalResourceF$ResourceMap#rawGet
[error]   Pool.scala:162               flatMap @ skunk.util.Pool$#free$1
[error]   Pull.scala:1209              >>$extension @ fs2.Pull$#fs2$Pull$$$_$go$1
[error]   PooledResource.scala:156     foreverM$extension @ org.tpolecat.poolparty.PooledResource$#org$tpolecat$poolparty$PooledResource$$$make
[error]   RunnerCompat.scala:257       void @ weaver.framework.RunnerCompat$ConcurrentQueueEventBroker#send
[error]   InterruptContext.scala:91    get @ fs2.internal.InterruptContext#eval$$anonfun$1
[error] 
[error] Error: Total 3, Failed 0, Errors 1, Passed 2
[error] Error during tests:
[error]         jobby.tests.frontend.UsersSpec
[error] (backend / Test / testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 9 s, completed 8 Oct 2022, 11:07:06