haskell-infra / hackage-trustees

Issue tracker for Hackage maintenance and trustee operations
https://hackage.haskell.org/packages/trustees/

Revise upper bound on base for cryptohash-sha1 to build with GHC 9.2 #319

Closed tchoutri closed 2 years ago

tchoutri commented 2 years ago

Same as https://github.com/haskell-infra/hackage-trustees/issues/313 for GHC 9.2 and base-4.16.
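As a rough illustration of why a revision is needed at all: a package whose upper bound on base excludes 4.16 is rejected by the solver under GHC 9.2, even if the code itself compiles. The sketch below is not Cabal's solver and the bounds in it are hypothetical (the actual bounds of cryptohash-sha1 are not quoted in this issue); it only shows the component-wise version comparison that such a bound check amounts to.

```python
def parse_version(text):
    """Turn a dotted version such as "4.16" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def within_bounds(version, lower, upper):
    """Check lower <= version < upper, comparing component by component."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# GHC 9.2 ships base-4.16 (as stated in this issue).
BASE_FOR_GHC_9_2 = "4.16"

# Hypothetical bounds, for illustration only.
before_revision = within_bounds(BASE_FOR_GHC_9_2, "4.5", "4.16")
after_revision = within_bounds(BASE_FOR_GHC_9_2, "4.5", "4.17")

print(before_revision)  # False: base-4.16 is excluded by "< 4.16"
print(after_revision)   # True: base-4.16 is admitted by "< 4.17"
```

A Hackage metadata revision changes only such bounds in the .cabal file, not the package sources.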

andreasabel commented 2 years ago

While a base bump would be sufficient for building the library, the tests and benchmarks are actually blocked upstream, see:

Bodigrim commented 2 years ago

@phadej intended to release an updated cryptohash-sha1 soon, https://github.com/haskell-hvr/cryptohash-md5/pull/7#issuecomment-948372848, so maybe we'd rather not rush with revisions.

phadej commented 2 years ago

A status update: I'm a bit distracted, as I'd like to get the benchmarks to build with GHC-9.2 too. That means following the aeson & criterion dependencies first. And there I'm waiting for unordered-containers to support hashable-1.4, so that I don't need to do a second sweep.

EDIT: Yes, I know I could switch to e.g. tasty-bench, but I'd still need to update aeson, so it wouldn't save me any total work.

tchoutri commented 2 years ago

@phadej Okay, thank you for the explanations

phadej commented 2 years ago

The next blocker is https://github.com/actions/runner/issues/1326: GHA jobs fail unreliably, and very often. I've run out of ideas about what the cause might be (old GHCs like 7.6 are not resource hungry). It doesn't make much sense to restart jobs if 4 out of 12 fail. (Restarting a single job would help; it's on the roadmap, but there is no ETA.)

I don't know if @emilypi or other HF folks (@Bodigrim) have any GitHub contacts who could be pointed to that issue. I don't feel very confident updating the libs. Also, the GHC-8.0 job fails, and I'd rather not drop GHC support just because CI is unreliable.

ethomson commented 2 years ago

Hi @phadej - is there a way to get more verbosity or diagnostics out of that failure? It looks like it is failing in a dependency installation? Could this be a networking problem? I apologize that I'm not very familiar with Haskell, could you tell me where these assets are being downloaded from?

phadej commented 2 years ago

They are already downloaded. At that point, e.g. in https://github.com/haskell-hvr/cryptohash-md5/runs/4191743221?check_suite_focus=true

Starting     criterion-measurement-0.1.3.0 (lib)
Building     criterion-measurement-0.1.3.0 (lib)
Installing   criterion-measurement-0.1.3.0 (lib)
Completed    criterion-measurement-0.1.3.0 (lib)
Installing   statistics-0.15.2.0 (lib)
Completed    statistics-0.15.2.0 (lib)
Starting     criterion-1.5.11.0 (lib)
Building     criterion-1.5.11.0 (lib)
Error: The operation was canceled.

everything is already downloaded. There are lines like

Downloading  base-orphans-0.8.6
Downloaded   base-orphans-0.8.6
Downloading  call-stack-0.4.0
Downloaded   call-stack-0.4.0

above in the same step.

Let me try to make a variant which is a bit more explicit about what is happening. However, I then need to reduce the parallelism from 2 to 1, and I suspect that the job might actually succeed in that case.

So I doubt it's a networking problem. In the failing aeson jobs (like https://github.com/haskell/aeson/runs/4186746204?check_suite_focus=true) the failing step is an actual library build step. So it's not a dependency installation problem either.

The only sure thing is that compiling aeson and criterion is resource intensive, but as described in the linked issue, it doesn't seem that the processes are running out of memory, because in those cases the runtime system would report the failure and the job would fail gracefully. However, these jobs are slow. It almost feels like memory-hungry (and/or CPU-hungry?) jobs are deprived of CPU power, so they are simply slow and actually time out.
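To make the reading of the log excerpt above concrete, here is a minimal sketch that records the last build phase seen for each package and reports those that never reached "Completed". It assumes only the line format visible in the excerpt; the function name unfinished_packages is made up for this illustration and is not part of cabal or any CI tooling.

```python
import re

# Log excerpt copied from the comment above.
LOG = """\
Starting     criterion-measurement-0.1.3.0 (lib)
Building     criterion-measurement-0.1.3.0 (lib)
Installing   criterion-measurement-0.1.3.0 (lib)
Completed    criterion-measurement-0.1.3.0 (lib)
Installing   statistics-0.15.2.0 (lib)
Completed    statistics-0.15.2.0 (lib)
Starting     criterion-1.5.11.0 (lib)
Building     criterion-1.5.11.0 (lib)
Error: The operation was canceled.
"""

# A phase keyword, whitespace, then the package identifier.
PHASE_LINE = re.compile(r"^(Starting|Building|Installing|Completed)\s+(\S+)")


def unfinished_packages(log):
    """Map each package that never reached "Completed" to its last seen phase."""
    last_phase = {}
    for line in log.splitlines():
        match = PHASE_LINE.match(line)
        if match:
            phase, package = match.groups()
            last_phase[package] = phase
    return {pkg: phase for pkg, phase in last_phase.items() if phase != "Completed"}


print(unfinished_packages(LOG))  # {'criterion-1.5.11.0': 'Building'}
```

On this excerpt the only unfinished package is criterion, cancelled while building, which matches the observation that the failure happens in a compile step rather than during download.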

ethomson commented 2 years ago

Thanks for the information. We're investigating.

phadej commented 2 years ago

@ethomson looks like this job, https://github.com/haskell-hvr/cryptohash-md5/runs/4192987633?check_suite_focus=true, is going to fail. It will have a bit more output (and timestamps for the build tool's output, not the compiler's).

ethomson commented 2 years ago

@phadej we made a few changes to some VM configuration. Can I ask you to queue a new build?

phadej commented 2 years ago

@ethomson sure:

ethomson commented 2 years ago

Very useful data. We changed the VM configuration as a test for runs in the haskell-hvr org. We did not change haskellari. This gives us more confidence that the VM configuration changes will be effective here. Working on applying this more broadly...

Bodigrim commented 2 years ago

> The only sure thing is that compiling aeson and criterion is resource intensive, but as described in the linked issue, it doesn't seem that the processes are running out of memory, because in those cases the runtime system would report the failure and the job would fail gracefully. However, these jobs are slow. It almost feels like memory-hungry (and/or CPU-hungry?) jobs are deprived of CPU power, so they are simply slow and actually time out.

I recently observed the same issue with jobs that are more resource intensive than usual: they tend to run for up to 30 minutes and then either succeed quickly or come to a halt until the 360-minute timeout. It is hard to debug in detail, as the jobs in question run on an emulated s390x machine. Emulation is terribly expensive and likely to eat all 7 GB of RAM available to workers; could it be an effect of swapping? I'd still expect them to finish within an hour at most, and local tests show no signs of freezing.

https://github.com/haskell/bytestring/actions/workflows/s390x.yml

ethomson commented 2 years ago

@phadej and @bodigrim we've rolled out the configuration to 100% - if you could please retry any jobs that were failing and let me know if you're still seeing problems!

phadej commented 2 years ago

@ethomson I queued a few jobs and they all passed. Thanks!

Bodigrim commented 2 years ago

@ethomson much better now, thanks!

phadej commented 2 years ago

https://hackage.haskell.org/package/cryptohash-sha1-0.11.101.0 released

andreasabel commented 2 years ago

Someone likes binary here...

tchoutri commented 2 years ago

@ethomson Thank you very much for the help!!