Closed avdv closed 1 month ago
@ndmitchell has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Thanks for the changes! This looks great. I'm heading off on holiday until Tuesday, so have sent it off for internal testing, but will probably land it once I'm back.
This pull request has been reverted by 10096f7bbcb7694fa08d0881eae976c35824c984.
@avdv afraid I had to revert this as it breaks in some circumstances. Just debugging which circumstances.
The issue is when you have a Haskell library with deps = [":a", ":a"]
- so the same dependency twice. That then generates an entry for the Haskell package database which is malformed and GHC complains. I've now added such an instance to our internal test suite (which we're working on open sourcing slowly...) so just trying to reapply.
79d74b500f35be4819a636e4d7901c4c29eee9de was the magic to make this work, so relanded as 340e7df97c2027e6e2376365662de23a4c35c239
Bad news - there was a significant performance regression from this code. It generates command lines that upset GHC. I don't yet know why, but investigating. Reverted once more.
There are two places this code causes a significant performance regression. In both cases it comes from the fact that multiple packages might exist within the same package_db. Before, since we deduplicated everything by value, you might have 100 packages and 20 package_dbs. If you give GHC 100 package dbs (even if there are only 20 unique ones) the performance crashes dramatically. That's especially true if you have a package_db representing all of stackage, which both occurs a lot and is slow to load. I've spun up #624 as a version with your two patches and my deduplication code in. Aiming to land that next week, but being cautious given this has gone wrong twice now 😄
There are two places this code causes a significant performance regression. In both cases it comes from the fact that multiple packages might exist within the same package_db. Before, since we deduplicated everything by value, you might have 100 packages and 20 package_dbs. If you give GHC 100 package dbs (even if there are only 20 unique ones) the performance crashes dramatically.
Oh, that's interesting. Thank you for pushing this forward!
To update, there ended up being a third place, which I fixed too. All sorted now and seems to have had no regressions for a while. Thanks for the contribution!
This PR changes the fields of HaskellLibraryProvider to contain a transitive set of HaskellLibraryInfo instead of using a list.
merge_haskell_link_infos
since it is now unused