Top-level namespace breaks jdeps analysis

dharrigan commented 2 years ago

Hi,

I'm experimenting creating a JRE from a JDK as outlined here, in this blog post: https://blog.adoptium.net/2021/10/jlink-to-produce-own-runtime/. Unfortunately, I'm receiving this error when attempting to analyse the jdeps of my jar:

Caused by: java.lang.module.InvalidModuleDescriptorException: clj_tuple__init.class found in top-level directory (unnamed package not allowed in module)
    at java.base/jdk.internal.module.ModulePath.toPackageName(ModulePath.java:720)

This is because I'm using clj-http 3.12.3 which uses potemkin/potemkin 0.4.5 which has a dependency upon clj-tuple 0.2.2.

I'm wondering, is there a way to replace clj-tuple in potemkin - considering clj-tuple is archived by the author and perhaps there is no need for it (since in potemkin, it's only used in one place, for the t/vector here https://github.com/clj-commons/potemkin/blob/3e404364ae2fd32f7a53b362a79d2012ab958ab2/src/potemkin/utils.clj#L97. Perhaps the built-in vector form in Clojure is fine to use instead?

KingMob commented 1 year ago

Hmmm. We removed clj-tuple from byte-streams for similar reasons (Graal also dislikes top-level namespaces), but 1) it wasn't on a critical path, and 2) byte-streams isn't too popular.

I just profiled clj-tuple's vector, and it's still 58% faster than core vector for smaller vector sizes (< ~50 elts), so I'm reluctant to replace it, even though it's only used for fast-memoize (which ironically, is NOT always faster than core memoize).

However, removing clj-tuple is probably a moot point, because you're going to run into the exact same issue with potemkin itself. Potemkin has a top-level namespace too, which will cause the same problem. (Many clojure libs from that era did this, until they figured out it was a bad idea.)

So really, the issue is to move away from a top-level ns. We can do what we did for byte-transforms: move all code under a new level, and change popular dependencies not to use the top-level version of fns. I don't know if that will actually please jdeps, but it's worth a shot.

KingMob commented 1 year ago

Side note to self: clj-tuple's vector is faster, but also more likely to make call sites megamorphic, because each vector size is a separate class. In the case of fast-memoize, it's making a vector for each arity, so functions called with 3+ different arities will go megamorphic, which is way worse than any benefits.

clj-commons / potemkin

Top-level namespace breaks jdeps analysis #64