dkpro / dkpro-jwpl

DKPro JWPL (DKPro Java Wikipedia Library) is a free, Java-based application programming interface that facilitates access to all information in Wikipedia.
https://dkpro.github.io/dkpro-jwpl
Apache License 2.0
83 stars 35 forks

Transition from javax to jakarta #226

Closed rzo1 closed 1 year ago

rzo1 commented 1 year ago

With Jakarta EE 9, the Java EE APIs moved from the "javax.*" namespace to "jakarta.*". The change is binary-incompatible, so it affects every library that uses those APIs and everyone who consumes such a library.

This issue aims to move JWPL to "jakarta.*".

JWPL mainly relies on two EE specs, both of which need some work to migrate:

As this will break consumers (at least at runtime), it needs to come with a major version bump.
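
To illustrate what the namespace move means at the source level, here is a minimal sketch that rewrites import lines. It is not how the actual migration was done. The class name `JakartaImportRewriter` and the two package prefixes in the map are assumptions chosen for illustration; which EE packages JWPL really uses is not stated here. The point it shows is that only the EE packages move, while `javax.*` packages that ship with the JDK (such as `javax.sql`) keep their names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JakartaImportRewriter {

    // Hypothetical mapping for illustration: only EE packages that moved are listed.
    // JDK packages such as javax.sql are deliberately absent, because they did not move.
    private static final Map<String, String> MOVED = new LinkedHashMap<>();
    static {
        MOVED.put("javax.persistence.", "jakarta.persistence.");
        MOVED.put("javax.xml.bind.", "jakarta.xml.bind.");
    }

    /** Rewrites one source line; lines without a moved prefix are returned unchanged. */
    public static String rewriteLine(String line) {
        String result = line;
        for (Map.Entry<String, String> entry : MOVED.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // An EE import is moved to the jakarta namespace.
        System.out.println(rewriteLine("import javax.persistence.Entity;"));
        // A JDK import is left alone.
        System.out.println(rewriteLine("import javax.sql.DataSource;"));
    }
}
```

Consumers have to make the same change in their own code and dependencies, which is why compiled artifacts built against the old namespace fail at runtime.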

rzo1 commented 1 year ago

FYI: I am working on a proof-of-concept implementation at the moment: https://github.com/rzo1/dkpro-jwpl

Open tasks:

I still need to sort out the CLA; after that I can submit a PR for further review.

rzo1 commented 1 year ago

The performance tests run successfully with the jakartarized version of JWPL, using the SQL dump from https://github.com/dkpro/dkpro-jwpl/issues/2#issuecomment-402049454 against a MariaDB instance in Heilbronn.

2023-10-18 11:05:49,626 DEBUG [main] api.PerformanceIT (PerformanceIT.java:121) - Test: retrieve 4000 pages - buffer = '1000' ...
2023-10-18 11:06:09,196 DEBUG [main] api.PerformanceTest (PerformanceTest.java:187) - RetrievedPages  : 1000
2023-10-18 11:06:09,196 DEBUG [main] api.PerformanceTest (PerformanceTest.java:188) - Used Buffer Size: 1
2023-10-18 11:06:09,200 DEBUG [main] api.PerformanceTest (PerformanceTest.java:189) - Time            : 19567ms
2023-10-18 11:06:09,201 DEBUG [main] api.PerformanceTest (PerformanceTest.java:190) - ------------------------------
2023-10-18 11:06:09,202 DEBUG [main] api.PerformanceIT (PerformanceIT.java:127) - Test: retrieve 4000 pages - buffer = '1000' ...
2023-10-18 11:06:11,179 DEBUG [main] api.PerformanceTest (PerformanceTest.java:187) - RetrievedPages  : 1000
2023-10-18 11:06:11,180 DEBUG [main] api.PerformanceTest (PerformanceTest.java:188) - Used Buffer Size: 10
2023-10-18 11:06:11,180 DEBUG [main] api.PerformanceTest (PerformanceTest.java:189) - Time            : 1976ms
2023-10-18 11:06:11,180 DEBUG [main] api.PerformanceTest (PerformanceTest.java:190) - ------------------------------
2023-10-18 11:06:11,181 DEBUG [main] api.PerformanceIT (PerformanceIT.java:133) - Test: retrieve 4000 pages - buffer = '1000' ...
2023-10-18 11:06:11,664 DEBUG [main] api.PerformanceTest (PerformanceTest.java:187) - RetrievedPages  : 1000
2023-10-18 11:06:11,665 DEBUG [main] api.PerformanceTest (PerformanceTest.java:188) - Used Buffer Size: 50
2023-10-18 11:06:11,665 DEBUG [main] api.PerformanceTest (PerformanceTest.java:189) - Time            : 483ms
2023-10-18 11:06:11,665 DEBUG [main] api.PerformanceTest (PerformanceTest.java:190) - ------------------------------
2023-10-18 11:06:11,666 DEBUG [main] api.PerformanceIT (PerformanceIT.java:115) - extern page loading and field accessing
2023-10-18 11:07:36,849 DEBUG [main] api.PerformanceTest (PerformanceTest.java:135) - -----------------
2023-10-18 11:07:36,849 DEBUG [main] api.PerformanceTest (PerformanceTest.java:136) - average throughput: 0.001748758684980031 pages/ms
2023-10-18 11:07:36,849 DEBUG [main] api.PerformanceTest (PerformanceTest.java:137) - average throughput: 1.748758684980031 pages/s
2023-10-18 11:07:36,849 DEBUG [main] api.PerformanceTest (PerformanceTest.java:137) - -----------------
2023-10-18 11:07:36,850 DEBUG [main] api.PerformanceIT (PerformanceIT.java:109) - intern page loading and field accessing
2023-10-18 11:07:45,635 DEBUG [main] api.PerformanceTest (PerformanceTest.java:135) - -----------------
2023-10-18 11:07:45,635 DEBUG [main] api.PerformanceTest (PerformanceTest.java:136) - average throughput: 0.014643587221810936 pages/ms
2023-10-18 11:07:45,636 DEBUG [main] api.PerformanceTest (PerformanceTest.java:137) - average throughput: 14.643587221810936 pages/s
2023-10-18 11:07:45,636 DEBUG [main] api.PerformanceTest (PerformanceTest.java:137) - -----------------
2023-10-18 11:07:45,636 DEBUG [main] api.PerformanceIT (PerformanceIT.java:139) - Test: retrieve 4000 pages - buffer = '1000' ...
2023-10-18 11:07:45,914 DEBUG [main] api.PerformanceTest (PerformanceTest.java:187) - RetrievedPages  : 1000
2023-10-18 11:07:45,915 DEBUG [main] api.PerformanceTest (PerformanceTest.java:188) - Used Buffer Size: 100
2023-10-18 11:07:45,915 DEBUG [main] api.PerformanceTest (PerformanceTest.java:189) - Time            : 278ms
2023-10-18 11:07:45,915 DEBUG [main] api.PerformanceTest (PerformanceTest.java:190) - ------------------------------
2023-10-18 11:07:45,916 DEBUG [main] api.PerformanceIT (PerformanceIT.java:145) - Test: retrieve 4000 pages - buffer = '1000' ...
2023-10-18 11:07:46,117 DEBUG [main] api.PerformanceTest (PerformanceTest.java:187) - RetrievedPages  : 1000
2023-10-18 11:07:46,117 DEBUG [main] api.PerformanceTest (PerformanceTest.java:188) - Used Buffer Size: 200
2023-10-18 11:07:46,117 DEBUG [main] api.PerformanceTest (PerformanceTest.java:189) - Time            : 201ms
2023-10-18 11:07:46,118 DEBUG [main] api.PerformanceTest (PerformanceTest.java:190) - ------------------------------
2023-10-18 11:07:46,118 DEBUG [main] api.PerformanceIT (PerformanceIT.java:151) - Test: retrieve 4000 pages - buffer = '1000' ...
2023-10-18 11:07:46,259 DEBUG [main] api.PerformanceTest (PerformanceTest.java:187) - RetrievedPages  : 1000
2023-10-18 11:07:46,260 DEBUG [main] api.PerformanceTest (PerformanceTest.java:188) - Used Buffer Size: 500
2023-10-18 11:07:46,260 DEBUG [main] api.PerformanceTest (PerformanceTest.java:189) - Time            : 140ms
2023-10-18 11:07:46,261 DEBUG [main] api.PerformanceTest (PerformanceTest.java:190) - ------------------------------
2023-10-18 11:07:46,261 DEBUG [main] api.PerformanceIT (PerformanceIT.java:103) - extern page loading
2023-10-18 11:07:52,928 DEBUG [main] api.PerformanceTest (PerformanceTest.java:102) - -----------------
2023-10-18 11:07:52,929 DEBUG [main] api.PerformanceTest (PerformanceTest.java:103) - average throughput: 0.02059393989573966 pages/ms
2023-10-18 11:07:52,929 DEBUG [main] api.PerformanceTest (PerformanceTest.java:104) - average throughput: 20.593939895739663 pages/s
2023-10-18 11:07:52,929 DEBUG [main] api.PerformanceTest (PerformanceTest.java:105) - -----------------
2023-10-18 11:07:52,929 DEBUG [main] api.PerformanceIT (PerformanceIT.java:96) - intern page loading
2023-10-18 11:07:54,816 DEBUG [main] api.PerformanceTest (PerformanceTest.java:102) - -----------------
2023-10-18 11:07:54,817 DEBUG [main] api.PerformanceTest (PerformanceTest.java:103) - average throughput: 0.05303066896268009 pages/ms
2023-10-18 11:07:54,817 DEBUG [main] api.PerformanceTest (PerformanceTest.java:104) - average throughput: 53.03066896268009 pages/s
2023-10-18 11:07:54,817 DEBUG [main] api.PerformanceTest (PerformanceTest.java:105) - -----------------
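
For readers comparing the runs, the following sketch turns the buffer timings from the log above into pages per second. It is only arithmetic on the logged values, not JWPL's own `PerformanceTest` code; the class and method names are hypothetical. It assumes each buffer run retrieved 1000 pages, as the `RetrievedPages` lines report, and that pages/s is pages/ms times 1000, which matches the paired throughput lines in the log.

```java
import java.util.Locale;

public class ThroughputExample {

    /** Pages per millisecond for a given page count and elapsed time. */
    public static double pagesPerMs(long pages, long elapsedMs) {
        return (double) pages / elapsedMs;
    }

    /** Converts a pages/ms figure to pages/s. */
    public static double pagesPerSecond(double pagesPerMs) {
        return pagesPerMs * 1000.0;
    }

    public static void main(String[] args) {
        // Buffer sizes and elapsed times copied from the log; 1000 pages per run.
        int[] bufferSizes = {1, 10, 50, 100, 200, 500};
        long[] timesMs = {19567, 1976, 483, 278, 201, 140};

        for (int i = 0; i < bufferSizes.length; i++) {
            double perSecond = pagesPerSecond(pagesPerMs(1000, timesMs[i]));
            System.out.printf(Locale.ROOT, "buffer=%d: %.1f pages/s%n", bufferSizes[i], perSecond);
        }
    }
}
```

The elapsed time drops with every increase in buffer size, most sharply between buffer sizes 1 and 10.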

rzo1 commented 1 year ago

The SQL dump is a bit too big for GitHub Actions and for the repo, but it can be used to run the performance tests locally, for example.