Summary:
Re-apply D56345659 and D56241179 (both reverted in D56364484), along with performance optimisations that account for multiple packages living in the same package_db. Before, since we deduplicated everything by value, you might have 100 packages but only 20 package_dbs. If you give GHC 100 package dbs (even if only 20 of them are unique), performance degrades dramatically. That's especially true if you have a package_db representing all of Stackage, which is both common and slow to load. The two places that need deduplication are:
1. When we construct the GHC_PACKAGE_PATH env var. The entries are fairly trivially deduped with a dict construction.
2. When we do packagedb_args.add. We already traverse the TSet in the loop above, so we dedupe into a dict there too.
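
A rough sketch of the dict-based dedupe, written in plain Python rather than the Starlark used by the actual rules. The function names and example paths are hypothetical, and the ":" separator assumes a Unix-style GHC_PACKAGE_PATH; only the idea (insert each package_db into a dict as a key, then iterate the keys) comes from this change.

```python
def dedupe_preserving_order(items):
    # Dict keys are unique and keep insertion order, so this drops
    # repeats while leaving the first occurrence of each entry in place.
    return list(dict.fromkeys(items))


def ghc_package_path(package_dbs, sep=":"):
    # Hypothetical helper: build the GHC_PACKAGE_PATH value from
    # the unique package dbs only.
    return sep.join(dedupe_preserving_order(package_dbs))


def packagedb_args(package_dbs):
    # Hypothetical helper: emit one -package-db flag per unique db,
    # no matter how many packages share that db.
    args = []
    for db in dedupe_preserving_order(package_dbs):
        args.extend(["-package-db", db])
    return args


# Three packages, two of which live in the same (made-up) package db.
dbs = ["/dbs/stackage", "/dbs/local", "/dbs/stackage"]
path_value = ghc_package_path(dbs)
flag_list = packagedb_args(dbs)
```

With 100 packages spread over 20 package dbs, both helpers would hand GHC 20 entries instead of 100.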
Differential Revision: D56378115