Open jmkeil opened 3 weeks ago
This is very much on point. There is a general problem in building an executable that ends up being terribly large because of all possible dependencies on features that a specific user may not need ... I am not sure how to fix this in the short term. One way could be to declare the dependency as "provided" in the module pom and add it in the launchers' build. I wonder whether there is any good practice we can refer to.
It would be nice to make the build more à la carte, but let's not clobber the ability to easily run the headless browser. I have had to use this capability several times.
I do not deny the existence of use cases for the headless variant. Therefore, my request was about a lightweight Triplifier in addition.
@jmkeil I'd like to work out a strategy to cope with this issue, and I think your case is perfect for that. Can you please clarify how you are using the package? Are you embedding the Maven package in your own build? If so, it should be as easy as marking the dependencies as `provided` in the POM and including them only in the runnable builds. Would that work?
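A minimal sketch of what that could look like, assuming the HTML module declares the Playwright driver bundle directly (it may instead arrive transitively through another Playwright artifact, in which case the declaration would need adjusting). The `playwright.version` property is a hypothetical placeholder, not something taken from the actual build:

```xml
<!-- Sketch for the HTML module's pom.xml (assumed location). -->
<dependency>
  <groupId>com.microsoft.playwright</groupId>
  <artifactId>driver-bundle</artifactId>
  <!-- Hypothetical property; use whatever version the project already pins. -->
  <version>${playwright.version}</version>
  <!-- "provided": on the compile classpath of this module, but not
       passed on transitively to projects that depend on it. -->
  <scope>provided</scope>
</dependency>
```

The launcher builds would then declare the same dependency again with the default `compile` scope so that the runnable artifacts still ship the headless browser.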
I do not use it yet, but I am considering using it soon to enable the import of data in several formats into my pipeline-based tool. However, I am concerned that the binaries will become quite large. I thought about either using `sparql-anything-engine` but excluding the Maven packages for a few formats that I do not need or that are too large, or only using the `sparql-anything-*` Maven packages I actually want in the first place. Having HTML among the supported formats would be nice, but increasing the binary size by an order of magnitude wouldn't be worth it to me.
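For illustration, the first option might look roughly like this in a downstream POM. This is a sketch under the assumption that `sparql-anything-engine` pulls in the format modules transitively; the `sparql-anything.version` property is a hypothetical placeholder:

```xml
<!-- Sketch for a downstream project's pom.xml. -->
<dependency>
  <groupId>io.github.sparql-anything</groupId>
  <artifactId>sparql-anything-engine</artifactId>
  <!-- Hypothetical property; substitute the release actually in use. -->
  <version>${sparql-anything.version}</version>
  <exclusions>
    <!-- Drop the HTML module and, with it, the Playwright driver bundle. -->
    <exclusion>
      <groupId>io.github.sparql-anything</groupId>
      <artifactId>sparql-anything-html</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree` afterwards shows whether the heavy artifacts are really gone from the resolved dependency set.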
The package `io.github.sparql-anything.sparql-anything-html` has a heavy storage footprint (>160 MB) due to its dependency on `com.microsoft.playwright.driver-bundle`, which basically ships Node.js binaries five times (Windows, Linux, Linux ARM, Mac and Mac ARM). To my understanding, this is needed to run a headless browser that interprets JS in the triplified HTML. I guess this is not needed in many use cases.
Therefore, I would like to ask you to consider providing an additional lightweight HTML Triplifier that just triplifies the static HTML document. This would result in significantly smaller binaries of upstream projects and would probably also improve the execution time.