hexpm / hex

Package manager for the Erlang ecosystem.
https://hex.pm

Regression in 2.1.0? #1029

Closed jschniper closed 1 month ago

jschniper commented 1 month ago

I'm receiving the following error with the latest version:

** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "decimal", "2.1.1"}}, 120000)
    ** (EXIT) time out
    (elixir 1.16.1) lib/gen_server.ex:1114: GenServer.call/3
    (hex 2.1.0) lib/hex/scm.ex:147: Hex.SCM.update/1
    (hex 2.1.0) lib/hex/scm.ex:246: Hex.SCM.checkout/1
    (mix 1.16.1) lib/mix/dep/fetcher.ex:64: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.16.1) lib/mix/dep/converger.ex:229: Mix.Dep.Converger.all/8
    (mix 1.16.1) lib/mix/dep/converger.ex:244: Mix.Dep.Converger.all/8
    (mix 1.16.1) lib/mix/dep/converger.ex:162: Mix.Dep.Converger.init_all/8
    (mix 1.16.1) lib/mix/dep/converger.ex:146: Mix.Dep.Converger.all/4

The only thing I can see that is at all special about the decimal dependency is that it appeared in multiple places in the umbrella app, and each one of those had an override: true option because of a library with an out-of-date version. I was able to remove the override: true and then upgrade that dependency, and mix deps.get finished as expected.

I'll try to dig into this a bit more tomorrow and see if I can be of more help.

jschniper commented 1 month ago

One more data point: I had a similar problem with {:jose, "~> 1.11", override: true}, and removing the override took care of it. That one only appeared once in the umbrella app.

ericmj commented 1 month ago

Can you give us a way to reproduce the error?

steve-pieris commented 1 month ago

We just started getting a similar looking error but for a different package:

** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "cowboy", "2.9.0"}}, 120000)
    ** (EXIT) time out
    (elixir 1.12.3) lib/gen_server.ex:1024: GenServer.call/3
    (hex 2.1.0) lib/hex/scm.ex:147: Hex.SCM.update/1
    (hex 2.1.0) lib/hex/scm.ex:246: Hex.SCM.checkout/1
    (mix 1.12.3) lib/mix/dep/fetcher.ex:64: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.12.3) lib/mix/dep/converger.ex:190: Mix.Dep.Converger.all/9
    (mix 1.12.3) lib/mix/dep/converger.ex:201: Mix.Dep.Converger.all/9
    (mix 1.12.3) lib/mix/dep/converger.ex:123: Mix.Dep.Converger.all/7
    (mix 1.12.3) lib/mix/dep/converger.ex:108: Mix.Dep.Converger.all/4

Ganarajr commented 1 month ago

Getting the same issue for a different package:

Getting google_gax (Hex package)
** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "google_gax", "0.3.1"}}, 120000)
    ** (EXIT) time out
    (elixir 1.14.1) lib/gen_server.ex:1038: GenServer.call/3
    (hex 2.1.0) lib/hex/scm.ex:147: Hex.SCM.update/1
    (hex 2.1.0) lib/hex/scm.ex:246: Hex.SCM.checkout/1
    (mix 1.14.1) lib/mix/dep/fetcher.ex:64: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.14.1) lib/mix/dep/converger.ex:213: Mix.Dep.Converger.all/9
    (mix 1.14.1) lib/mix/dep/converger.ex:224: Mix.Dep.Converger.all/9
    (mix 1.14.1) lib/mix/dep/converger.ex:146: Mix.Dep.Converger.all/7
    (mix 1.14.1) lib/mix/dep/converger.ex:131: Mix.Dep.Converger.all/4

ericmj commented 1 month ago

It can probably happen for any package, but we need a way to reproduce it to fix the issue. Please share your mix project including the mix.exs and mix.lock files.

barrieloydall commented 1 month ago

Seeing the same issue with another dependency.

The only change between CI builds was the update from Hex 2.0.6 to Hex 2.1.0.

We see the issue with html_entities, a dependency of floki, and floki is used by an in-house library. Due to a breaking change some time ago, we pinned the version of html_entities with override: true in our library that uses floki.

e.g. {:html_entities, "0.5.1", override: true},

I was able to avoid the error by additionally including the html_entities dependency definition in our main service application that used the library.

main_service {:html_entities, "0.5.1", override: true}
└── in_house_library {:html_entities, "0.5.1", override: true}
    └── floki
        └── html_entities
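
A minimal sketch of that workaround, showing only the deps list of the main service's mix.exs. The :in_house_library name mirrors the tree above, and the path: source is an assumption made for illustration; the thread does not say how the library is actually pulled in.

```elixir
# mix.exs of main_service (fragment; the library name and path are placeholders)
defp deps do
  [
    # Hypothetical source for the in-house library; keep whatever you already use.
    {:in_house_library, path: "../in_house_library"},
    # Restate the pin that the library already declares, so the top-level
    # project carries the same override.
    {:html_entities, "0.5.1", override: true}
  ]
end
```

This only sidesteps the timeout reported here; it does not explain why the override declared in the library alone stopped being enough under Hex 2.1.0.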

Seeing the following error:

* Getting html_entities (Hex package)
** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "html_entities", "0.5.1"}}, 120000)
    ** (EXIT) time out
    (elixir 1.15.7) lib/gen_server.ex:1074: GenServer.call/3
    (hex 2.1.0) lib/hex/scm.ex:147: Hex.SCM.update/1
    (hex 2.1.0) lib/hex/scm.ex:246: Hex.SCM.checkout/1
    (mix 1.15.7) lib/mix/dep/fetcher.ex:64: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.15.7) lib/mix/dep/converger.ex:229: Mix.Dep.Converger.all/8
    (mix 1.15.7) lib/mix/dep/converger.ex:244: Mix.Dep.Converger.all/8
    (mix 1.15.7) lib/mix/dep/converger.ex:162: Mix.Dep.Converger.init_all/8
    (mix 1.15.7) lib/mix/dep/converger.ex:146: Mix.Dep.Converger.all/4

Exited with code exit status 1

In the output for "Resolving Hex dependencies..." html_entities was no longer listed under the "Unchanged:" list.

ericmj commented 1 month ago

Please folks, you gotta help me out here. Just reporting the same thing does not help.

Please share a way for us to reproduce the error. The easiest way to do that is by sharing your mix.exs file (all of them if it's an umbrella app) and the mix.lock file.

barrieloydall commented 1 month ago

> Please folks, you gotta help me out here. Just reporting the same thing does not help.
>
> Please share a way for us to reproduce the error. The easiest way to do that is by sharing your mix.exs file (all of them if it's an umbrella app) and the mix.lock file.

Hi @ericmj, it can be a little difficult to share private mix files as they can contain other private dependencies and information. What details from the files would help specifically?

I will try to create a stripped-back minimal version to help reproduce the issue.

steve-pieris commented 1 month ago

We also use private mix files (which might be part of the cause of this timeout behaviour).
I managed to reproduce the issue in a stripped-down project.

Using the same versions as our existing app (Elixir v1.12.3 and phx_new 1.6.2), I created a new umbrella project:

mix archive.install hex phx_new 1.6.2
mix phx.new deps_timeout_test --umbrella
cd deps_timeout_test
mix deps.get

At this point everything works fine.

Then update apps/deps_timeout_test_web/mix.exs to include a private forked version of the esaml dep that we have customised. NOTE: I've replaced the fork and SHA for privacy.

      {:esaml, "~> 4.2.0",
        github: "<fork>/esaml", ref: "<gitsha>"}

Then clear the deps and attempt to get them again. NOTE: the clear is important because our CI build server always starts fresh with no fetched deps.

mix deps.clean --all
mix deps.get

and then we get the error:

* Getting cowboy (Hex package)
** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "cowboy", "2.12.0"}}, 120000)
    ** (EXIT) time out
    (elixir 1.12.3) lib/gen_server.ex:1024: GenServer.call/3
    (hex 2.1.0) lib/hex/scm.ex:147: Hex.SCM.update/1
    (hex 2.1.0) lib/hex/scm.ex:246: Hex.SCM.checkout/1
    (mix 1.12.3) lib/mix/dep/fetcher.ex:64: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.12.3) lib/mix/dep/converger.ex:190: Mix.Dep.Converger.all/9
    (mix 1.12.3) lib/mix/dep/converger.ex:201: Mix.Dep.Converger.all/9
    (mix 1.12.3) lib/mix/dep/converger.ex:123: Mix.Dep.Converger.all/7
    (mix 1.12.3) lib/mix/dep/converger.ex:108: Mix.Dep.Converger.all/4

NOTE: When I inspect the esaml entry in the mix.lock file, I do not see the cowboy dependency.

If I update apps/deps_timeout_test_web/mix.exs again to use the public esaml, i.e.:

{:esaml, "~> 4.2.0"}

and run

mix deps.clean --all
mix deps.get

I get this error:

Resolving Hex dependencies...
Resolution completed in 0.051s
Because "the lock" specifies cowboy 2.12.0 and esaml >= 4.0.0 and < 4.3.0 depends on cowboy 2.6.0, the lock is incompatible with esaml >= 4.0.0 and < 4.3.0.
And because your app depends on the lock, esaml >= 4.0.0 and < 4.3.0 is forbidden.
So, because your app depends on esaml ~> 4.2.0, version solving failed.

Our forked version of esaml has bumped cowboy in its rebar.config to 2.9.0. But if there is a version mismatch, I would expect an error instead of a timeout.

Ganarajr commented 1 month ago

We tried https://github.com/hexpm/hex/issues/1019#issuecomment-2002087278 for google_gax, which works. Here is the updated mix.exs entry:

{:google_gax, git: "https://github.com/googleapis/elixir-google-api.git", tag: "ba9bf3ac88905ef3009bd043aac5b32679aa1052", sparse: "clients/gax", override: true},

However, we now get the same error with poison:

Getting poison (Hex package)
** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "poison", "3.1.0"}}, 120000)
    ** (EXIT) time out
    (elixir 1.14.1) lib/gen_server.ex:1038: GenServer.call/3
    (hex 2.1.0) lib/hex/scm.ex:147: Hex.SCM.update/1
    (hex 2.1.0) lib/hex/scm.ex:246: Hex.SCM.checkout/1
    (mix 1.14.1) lib/mix/dep/fetcher.ex:64: Mix.Dep.Fetcher.do_fetch/3
    (mix 1.14.1) lib/mix/dep/converger.ex:213: Mix.Dep.Converger.all/9
    (mix 1.14.1) lib/mix/dep/converger.ex:224: Mix.Dep.Converger.all/9
    (mix 1.14.1) lib/mix/dep/converger.ex:146: Mix.Dep.Converger.all/7
    (mix 1.14.1) lib/mix/dep/converger.ex:131: Mix.Dep.Converger.all/4

ericmj commented 1 month ago

> Hi @ericmj, it can be a little difficult to share private mix files as they can contain other private dependencies and information. What details from the files would help specifically?

We need the deps entry from mix.exs and the full mix.lock file.

lud-wj commented 1 month ago

@ericmj I cannot share my mix.exs either because you would not be able to pull our private libraries.

But I ran a quick debug session and it seems that Hex.Parallel does not receive a {:run, ...} call for the tarball.

RUN {:registry, "hexpm", "singleton"} await=false
ENQUEUE {:registry, "hexpm", "singleton"}
RUN_TASK {:registry, "hexpm", "singleton"}
SEND REPLY {:registry, "hexpm", "singleton"}
Resolving Hex dependencies...
Resolution completed in 0.523s
Unchanged:
  [...]
* Getting singleton (Hex package)
AWAIT {:tarball, "hexpm", "singleton", "1.3.2"}
** (exit) exited in: GenServer.call(:hex_fetcher, {:await, {:tarball, "hexpm", "singleton", "1.3.2"}}, 5000)
    ** (EXIT) time out
    (elixir 1.16.2) lib/gen_server.ex:1114: GenServer.call/3

(timeout of 5000 is my change)

In my case we have a dependency that depends on singleton with override: true. To be clear, the override flag is in the direct dependency's mix.exs file, not in ours, and it is not needed AFAIK because our direct dependency is the only one depending on singleton. That specific case does not seem to be a requirement for the bug though.

(not sure if this is helpful, I don't know how hex works)
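
To make the observation above concrete, here is a toy sketch of the failure mode: a server that replies to :await only for keys that were previously run. This is a hypothetical illustration, not Hex's actual fetcher; the ToyFetcher module and its run/await functions are invented names, and the real implementation runs jobs asynchronously rather than inline.

```elixir
# Hypothetical toy fetcher, NOT Hex's actual implementation.
defmodule ToyFetcher do
  use GenServer

  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  end

  # Registers a job. The function runs inline to keep the sketch short.
  def run(key, fun), do: GenServer.call(__MODULE__, {:run, key, fun})

  # Waits for a job's result; exits with a timeout if no reply arrives in time.
  def await(key, timeout), do: GenServer.call(__MODULE__, {:await, key}, timeout)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:run, key, fun}, _from, state) do
    {:reply, :ok, Map.put(state, key, fun.())}
  end

  def handle_call({:await, key}, _from, state) do
    case state do
      %{^key => result} -> {:reply, result, state}
      # Nobody ran this key, so no reply is ever sent and the caller times out.
      _ -> {:noreply, state}
    end
  end
end

{:ok, _pid} = ToyFetcher.start_link()

# The registry job was run, so awaiting it returns immediately.
:ok = ToyFetcher.run({:registry, "hexpm", "singleton"}, fn -> :fetched end)
:fetched = ToyFetcher.await({:registry, "hexpm", "singleton"}, 100)

# The tarball job was never run, so await exits after the timeout.
try do
  ToyFetcher.await({:tarball, "hexpm", "singleton", "1.3.2"}, 100)
catch
  :exit, {:timeout, _} -> IO.puts("await timed out: tarball job was never enqueued")
end
```

The point of the sketch is only that an await for a job nobody scheduled surfaces as a GenServer.call timeout rather than as a descriptive error, which matches the shape of the traces in this thread.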

stefanluptak commented 1 month ago

For those that have their CI pipelines failing because of this, replace this: mix local.hex --force with this: mix local.hex 2.0.6 --force in your CI scripts/workflows.

flambard commented 1 month ago

We are seeing this issue with one of our umbrella projects that has an Erlang application in it. We had problems with the dependencies of that application (hackney and jsx).

wojtekmach commented 1 month ago

For anyone running into this error, please consider narrowing this down to a minimal reproduction by removing dependencies and re-running and then sending us a mix.exs file. It might be enough to just have:

# mix.exs
defmodule Foo do
  def project do
    [app: :foo, version: "1.0.0", deps: deps()]
  end

  defp deps do
    [
      # ...
    ]
  end
end

If mix.lock matters, that is you can't reproduce this without it existing prior, please send it too. A Mix.install invocation would be perfectly fine too. Thank you.
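
For readers unfamiliar with the Mix.install option, a reproduction script could look like the sketch below. The two deps are copied from a report elsewhere in this thread purely as an example; whether they trigger the timeout outside an umbrella project is not confirmed here, so treat the list as a placeholder for your own narrowed-down deps.

```elixir
# repro.exs, run with: elixir repro.exs
# Placeholder deps: replace with the smallest list that still fails for you.
Mix.install(
  [
    {:pg_ranges, "1.1.0"},
    {:decimal, "~> 2.0", override: true}
  ],
  # Bypass the Mix.install cache so every run fetches from scratch.
  force: true
)

IO.puts("deps fetched without a timeout")
```

Note that a script like this has no mix.lock, so it only helps when the lock file is not needed to trigger the problem.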

stefanluptak commented 1 month ago

> For anyone running into this error, please consider narrowing this down to a minimal reproduction by removing dependencies and re-running and then sending us a mix.exs file. It might be enough to just have:
>
> # mix.exs
> defmodule Foo do
>   def project do
>     [app: :foo, version: "1.0.0", deps: deps()]
>   end
>
>   defp deps do
>     [
>       # ...
>     ]
>   end
> end
>
> If mix.lock matters, that is you can't reproduce this without it existing prior, please send it too. A Mix.install invocation would be perfectly fine too. Thank you.

https://github.com/stefanluptak/hex_debug

Just clone it and run mix deps.get from the root of the repo with Hex 2.1.0.

wojtekmach commented 1 month ago

@stefanluptak perfect, I was able to reproduce the issue.

jschniper commented 1 month ago

So a little more data on this one:

Sample Mix files that are enough to trigger it:

defmodule A.MixProject do
  use Mix.Project

  def project do
    [
      app: :a,
      version: "0.1.0",
      build_path: "../../_build",
      config_path: "../../config/config.exs",
      deps_path: "../../deps",
      lockfile: "../../mix.lock",
      elixir: "~> 1.16",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  # Run "mix help compile.app" to learn about applications.
  def application do
    [
      extra_applications: [:logger],
      mod: {A.Application, []}
    ]
  end

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      {:b, in_umbrella: true}
    ]
  end
end

defmodule B.MixProject do
  use Mix.Project

  def project do
    [
      app: :b,
      version: "0.1.0",
      build_path: "../../_build",
      config_path: "../../config/config.exs",
      deps_path: "../../deps",
      lockfile: "../../mix.lock",
      elixir: "~> 1.16",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  # Run "mix help compile.app" to learn about applications.
  def application do
    [
      extra_applications: [:logger],
      mod: {B.Application, []}
    ]
  end

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      {:pg_ranges, "1.1.0"},
      {:decimal, "~> 2.0", override: true}
    ]
  end
end

CuriousCurmudgeon commented 1 month ago

We don't have a minimal reproduction, but we are seeing the same issue in a non-umbrella app. Explicitly using 2.0.6 fixed it for us as well.

SamHutchings commented 1 month ago

No minimal repro, but adding another data point: we are seeing the same error when running mix do deps.get --only prod, deps.compile --skip-umbrella-children.

The package that times out is prometheus version 4.11.0. It is declared in an umbrella app mix file as {:prometheus, "~> 4.0", override: true}. We also have a private repo for Oban Pro.

ericmj commented 1 month ago

> We don't have a minimal reproduction, but we are seeing the same issue in a non-umbrella app. Explicitly using 2.0.6 fixed it for us as well.

@CuriousCurmudgeon Do you have a non-minimal reproduction?

EDIT: nvm, I also found a way to reproduce without an umbrella.

ceolinrenato commented 1 month ago

Hey @ericmj thanks for the fix!

Is this going to be published as 2.1.1?

ericmj commented 1 month ago

Yes. 2.1.1 has been published.