Open anthonyvdotbe opened 4 years ago
You are asking questions that people not familiarized with modules cannot answer easily. And no, there aren't 5 or more module combinations that make sense (per your Q1). The maintainers of this project hoped that we could reach some agreement, but it is obviously difficult if not impossible.
everyone's interest for a final decision to be taken sooner rather than later
No, not everyone is in a hurry about that. It has been explained that there won't be any answer before August (and note that "not before August" is not the same as "in August"), and I am fine if for example my question in issue #25 is answered in August or September.
And that's because in the meantime, we are (probably) going to have 1.5 with the automatic module name.
version 1.5 is released with minimal changes [...] In particular: no automatic module name, no changes to the dependency versions
It is not reasonable to block the solution to a serious problem (the lack of a declared module name) because of hypothetical backwards compatibility issues in 2.0 (which is going to be called 2.0 precisely because some backward incompatibilities are possible). My proposed XOM module split (#25) for 2.0 would affect the subset of developers that use htmlparser
+ XOM, which is likely to be small (while your proposal for a saxtree
module would be more invasive and still keeps the XOM problem inside).
Finally, I try to keep a constructive dialogue but your rejection of dependency version upgrades looks like opposing just for the sake of it. Could we spare this modularization drama for now, release 1.5 and retake the matter sometime later?
Replying to your edit of the original post:
I'm fine with a 1.5 release as it is (except for the use of version ranges, but I'll argument that in the PR).
This is a library and not the final application, so using carefully chosen version ranges is a good practice (albeit not mandatory). I gave more details in my response to your comment in the PR.
First of all, I acknowledge the annoyance of having a main branch owner who isn't up-to-speed with Java Modules (it's taken me way more time in terms of calendar distance than I hoped to get some idea of Java Modules) and also is as slow to respond as I have been. Sorry.
Currently, technically the compile-time-hard and run-time-optional dependency on both encoding detectors and on XOM are handled the same way. I see an issue open about XOM, but XOM seems to have more degrees of freedom for the solution, since if you use XOM, you already choose a XOM-specific API entry point anyway. However, the encoding detectors are something you can optionally enable regardless of entry point, so in that sense solving those should solve XOM, too, unless it's a problem for XOM types to appear in the outward API.
Considering the way Java worked up to and including 8, it would have been a backward compatibility bug to change the fully-qualified name of a class that remains otherwise compatible. That is, if you wrote an app with htmlparser.jar
in the Java 5 days (and didn't explicitly call the HTML4-enablement methods that will be removed for 2.0), I think it's prima facie a bug if you have to make source changes to your app when you drop in the Java Modules-enabled future nu-validator-htmlparser.jar
.
So far, I've understood that there are now restrictions on what packages nu-validator-htmlparser.jar
is allowed to provide, and that it's not allowed to provide nu.validator.saxtree
. Correct? OK, minting another .jar
for that, while disappointing, isn't that bad, since callers don't need to change any fully-qualified names.
https://github.com/validator/htmlparser/issues/25 says that while requires static
would work, it would defeat the purpose of modularization for the public API of the nu.validator.htmlparser
module to expose types from nu.xom
if nu.xom
is optional.
While I can guess that "formally wrong" is bad, is there some articulation of how the badness would manifest in practice?
OTOH, https://github.com/carlosame/htmlparser/tree/xom-removed/src/nu/validator/htmlparser and https://github.com/carlosame/htmlparser-xom/tree/master/src/nu/validator/htmlparser/xom suggest that the module system does allow a module called nu.validator.htmlparser
to ship packages called nu.validator.htmlparser.foo
while a module called nu.validator.htmlparser.xom
ships a sibling package named nu.validator.htmlparser.xom
. Since that still doesn't change any fully-qualified names, I guess that isn't any worse than having to ship nu-validator-saxtree.jar
as a separate jar.
Considering that the validator project has over the span of its existence gone from me spreading stuff over multiple repos to @sideshowbarker merging some of the repos, I'm a bit uneasy about additional repos and am leaning towards keeping the XOM stuff in this repo if permitted by the rules of Modules and Maven.
Regarding a Maven dependency, I'd hope that after all changes, it would still be possible to point Eclipse to the source directories and the dependency jars and have stuff build without Maven.
As for JDK target, I believe the source code is still Java 5-compatible. Not using new language constructs matters for the Java-to-C++ translator, but actually running the code on a Java 5 JVM is unlikely to be a use case worth supporting. Running on OpenJDK 8 and recent-ish Android does seem relevant, though. Do I understand correctly that modularized jars can be released with the byte code compatibility level set to 8 and then pre-Modules user just dumps all the jars in the classpath, the older JVM ignores the Module manifests, and that's it?
Going back to the encoding detectors: Ideally, the parser would depend on a Java port of chardetng, but one does not exist. The ICU detector isn't very good. In the absence of a Java port of chardetng, jchardet can make some sense. Do I understand correctly that the blocker for depending on it even via requires static
is that the project itself doesn't publish with a module manifest?
I've never used the RPM and OSGi stuff myself. Someone contributed them at some point. I have no idea if someone still cares. Probably prudent to leave that stuff in for 1.5 if it's easy to do so, drop that stuff for 2.0 and see if anyone complains.
it would have been a backward compatibility bug to change the fully-qualified name of a class that remains otherwise compatible
Well, none of the changes being discussed involves changing the fully-qualified name of any class (except for the conversion that I did of the unit test in the proposed new XOM repo where I put it in its parent package, but I do not think that you are talking about this).
you have to make source changes to your app when you drop in the Java Modules-enabled future
nu-validator-htmlparser.jar
No you do not need to make source changes if you use Java 7 or 8 (at least if we discuss my modularization patches, are we?). If you use later JDKs, you need to require
the declared package name, that's it. And the jar file can be htmlparser.jar
as always.
it's not allowed to provide
nu.validator.saxtree
. Correct?
There is no such a restriction.
is there some articulation of how the badness would manifest in practice?
The point of modularization is to know which modules you require to do certain work. If you start saying "well this may require this and that, or it may not", then modularizing is moot. My point is: if you want to keep the Jar file as it is (containing all the current packages), there is no adavantage in doing a full modularization (which brings its own problem with it: filename-based dependencies), and the project should be happy with the automatic module name.
am leaning towards keeping the XOM stuff in this repo
That's fine, but then my above comment applies.
(PS: it would be great if somebody volunteered to write at least one unit test for XOM :slightly_smiling_face:)
it would still be possible to point Eclipse to the source directories and the dependency jars and have stuff build without Maven.
Yes Maven is just a deployment-stage thing (although you can use a special "Maven" kind of Eclipse project, but I personally do not like them).
Just to clarify: you do not need to put Maven as a dependency in Eclipse.
modularized jars can be released with the byte code compatibility level set to 8 and then pre-Modules user just dumps all the jars in the classpath, the older JVM ignores the Module manifests, and that's it?
If we are discussing my modular patches (the only ones that have been presented) they are compatible with Java 7 and 8, and at the same time modular JDKs (JDK 11+) can get their module-info
. I have been using this configuration successfully in css4j since I set it up about a year ago.
the blocker for depending on it even via
requires static
is that the project itself doesn't publish with a module manifest?
jchardet
does not supply an automatic module name in its MANIFEST.MF
. This is a PITA (as this project currently is...), but in this particular case it could be less harmful because I do not expect a modular jchardet
to ever exist (the project seems abandoned). So in principle, no clash between 'old' and 'new' module names in that case (still bad though).
First of all, I acknowledge [...]
Thanks, I appreciate it.
Considering the way Java worked up to and including 8, it would have been a backward compatibility bug to change the fully-qualified name of a class that remains otherwise compatible. That is, if you wrote an app with
htmlparser.jar
in the Java 5 days (and didn't explicitly call the HTML4-enablement methods that will be removed for 2.0), I think it's prima facie a bug if you have to make source changes to your app when you drop in the Java Modules-enabled futurenu-validator-htmlparser.jar
.
Actually, I would like to move some internal classes in nu.validator.htmlparser.common
(e.g. the Interner
interface) to a non-exported package as part of the modularization in 2.0. So in theory this would require source changes, but in practice I doubt anyone would (and believe no one should) be relying on any of those internal classes. Other than this, no source changes would be required.
So far, I've understood that there are now restrictions on what packages
nu-validator-htmlparser.jar
is allowed to provide, and that it's not allowed to providenu.validator.saxtree
. Correct?
No, the only technical restriction is that 2 Java modules cannot contain the same package (no matter whether they're exported).
However, a guideline to avoid running into that restriction, is to use the package name of the top-most package as the module name (and to avoid module names that are a prefix of another module name, except for aggregate modules). So if we want to follow that guideline, there ought to be at least 2 modules, nu.validator.htmlparser
and nu.validator.saxtree
(or the module should be named nu.validator
, or nu.validator.saxtree
should be moved to nu.validator.htmlparser.saxtree
).
W.r.t. modularization, I believe the library should be geared towards the users that "simply" want to parse HTML5 via one of the 3 APIs. So another reason why I'd like 2 separate Java modules, is that allows to effectively hide the saxtree
package from users of the htmlparser
module. With a single module, the saxtree
package would need to be exported, but would be noise to nearly all of the users.
OK, minting another
.jar
for that, while disappointing, isn't that bad, since callers don't need to change any fully-qualified names.
What makes you say it's disappointing?
Regarding a Maven dependency, I'd hope that after all changes, it would still be possible to point Eclipse to the source directories and the dependency jars and have stuff build without Maven.
Yes, that'll still be possible, no matter how the modularization is done. (I'm curious why you want Eclipse to build stuff though, rather than relying on Maven or Gradle. If you open a Maven project with Eclipse (i.e. no .classpath
, .project
etc., just a pom.xml
), then Eclipse will simply generate those files by looking at the POM.)
As for JDK target, I believe the source code is still Java 5-compatible. Not using new language constructs matters for the Java-to-C++ translator, but actually running the code on a Java 5 JVM is unlikely to be a use case worth supporting. Running on OpenJDK 8 and recent-ish Android does seem relevant, though. Do I understand correctly that modularized jars can be released with the byte code compatibility level set to 8 and then pre-Modules user just dumps all the jars in the classpath, the older JVM ignores the Module manifests, and that's it?
Exactly.
Going back to the encoding detectors: Ideally, the parser would depend on a Java port of chardetng, but one does not exist. The ICU detector isn't very good. In the absence of a Java port of chardetng, jchardet can make some sense. Do I understand correctly that the blocker for depending on it even via
requires static
is that the project itself doesn't publish with a module manifest?
Yes. Either jchardet
support should be removed, or a new version with a module descriptor should be published. An alternative is to just "fork" it (e.g. create a new repo under https://github.com/validator, or just add it as a separate module in this repo).
Finally, I'd like to reiterate that, in my opinion, Q1 in my OP is the essential question here. I.e. we shouldn't be concerned about project organisation in terms of Maven modules and/or Git repos at this point: no matter what the answer to Q1 is, it's possible to organise the project however you want (though some combinations wouldn't make sense, of course. For example if we'd go with a single Java module, it wouldn't make sense to use multiple Maven modules or Git repos). In this regard, what are your thoughts on the first point?
Actually, I would like to move some internal classes in nu.validator.htmlparser.common (e.g. the Interner interface) to a non-exported package as part of the modularization in 2.0.
What problem would this solve?
So in theory this would require source changes, but in practice I doubt anyone would (and believe no one should) be relying on any of those internal classes.
TransitionHandler
there was added for NetBeans. Not sure what its usage is today.
nu.validator.htmlparser.impl
is deliberately public so that you can write TreeBuilder
subclasses that aren't provided by this project. (Also, spinning of XOM into a different module would require TreeBuilder
to be subclassable from that module.) Things under common
are visible in the signatures of impl
.
What makes you say it's disappointing?
Mainly having to replace one jar with many when upgrading and the resulting proliferation of jar. Maybe that's not a real problem.
Yes. Either
jchardet
support should be removed, or a new version with a module descriptor should be published. An alternative is to just "fork" it (e.g. create a new repo under https://github.com/validator, or just add it as a separate module in this repo).
It's disappointing if the existing jar can't be used as-is. Considering that I'd like to get rid of jchardet
eventually, I'm really not keen on de facto becoming responsible for jchardet
's distribution beyond its usage by htmlparser
. Realistically, though, the replacement I want is vaporware without a concrete path into becoming real software, so in that sense "eventually" can be expected to be far off anyway.
(That is, porting encoding_rs
and chardetng
to Java are not going to happen as part of my job. Writing a JNI wrapper for them seems like an educational hobby project of realistic scope, but a JNI dependency wouldn't be viewed favorably by the Java community for real-world deployment. Doing a translation that would result either .java
files or Java bytecode seems like too large for what I have hobby project time for.)
Is there really no backward compatibility mechanism that would allow the existing jchardet
to be used as-is? @carlosame's most recent comment suggests that there is.
Finally, I'd like to reiterate that, in my opinion, Q1 in my OP is the essential question here.
I'm leaning towards three Java Modules:
nu.validator.htmlparser
, nu.validator.saxtree
, and nu.validator.htmlparser.xom
.
However, this is based on a pre-Modules view of the world where DOM and SAX are assumed to be always present, so perhaps it would make sense to treat sax
and dom
the same way as xom
, i.e.:
nu.validator.htmlparser.sax
depends on nu.validator.htmlparser
, nu.validator.saxtree
, java.xml
, and java.base
.nu.validator.htmlparser.dom
depends on nu.validator.htmlparser
, java.xml
, and java.base
.nu.validator.htmlparser.xom
depends on nu.validator.htmlparser
, nu.xom
, and java.base
.Considering that nu.xom
depends on SAX, it's unclear to me what problem this would solve other than allowing custom tree builders that use none of SAX, DOM, or XOM not to depend on java.xml
.
Since you've used Modules and I haven't, do you see any benefit from decoupling the core of the parser from java.xml
Module-wise?
Regardless of the module division, I'd prefer all these to stay in this git repo. (As noted, we used to have more repos for the validator as a whole and moved towards having fewer.) What that means for Maven, I don't know.
My point of departure is: only export each of the dom/sax/xom
packages plus the "transitive closure" of their API.
So in practice this means: additionally export the common
package, except for the following interfaces: ByteReadable
, EncodingDeclarationHandler
, Interner
and TokenHandler
. (I've only found 2 projects which make advanced use of this project: env-js, which hasn't seen a commit in 9 years, and NetBeans, which doesn't use any of these interfaces. So I believe that moving them would not make a difference in practice. And even if it would, the loss of backward compatibility would be more than compensated by the gain of a minimal exported API.)
By only exporting a minimal API:
While I understand TreeBuilder
is a supported API, I believe the impl
package shouldn't be exported, at least not as-is and not without actual demand:
saxtree
: it's useful on its own, but its direct usage is likely limited to a small percentage of this project's consumersW.r.t. project organization, I propose the following: a single Git repo, containing a single multi-module Maven project, containing a number of Maven modules, each defining a single Java module.
For the Java modules, I'd start with your first division + nu.validator.jchardet
, where nu.validator.htmlparser
would only export 3 packages: common
, dom
, sax
.
Then when there's an actual demand for exporting additional packages, we could see how to best meet that demand (it's trivial to create a GitHub issue, so I'm sure people would do so soon enough if they're relying on some of the unexported API. And in the meantime they could just apply the necessary command-line flags or put the JARs on the classpath to work around any module restrictions). One possible solution would be something like the following in that case:
* `nu.validator.jchardet` { exports `jchardet`; }
* `nu.validator.saxtree` { exports `saxtree`; }
* `nu.validator.htmlparser.common` { exports `common`; requires transitive `java.xml`; }
* `nu.validator.htmlparser.base` { exports `treebuilder`; requires transitive `nu.validator.htmlparser.common`; requires `nu.validator.jchardet`; }
* `nu.validator.htmlparser.dom` { exports `dom`; requires transitive `nu.validator.htmlparser.common`; requires `nu.validator.htmlparser.base`; }
* `nu.validator.htmlparser.sax` { exports `sax`; requires transitive `nu.validator.htmlparser.common`; requires `nu.validator.htmlparser.base`, `nu.validator.saxtree`; }
* `nu.validator.htmlparser.xom` { exports `xom`; requires transitive `nu.validator.htmlparser.common`, `nu.xom`; requires `nu.validator.htmlparser.base`; }
Some notes:
nu.validator.htmlparser
into a common
and base
module. This allows to maintain the minimal API in the dom/sax/xom
modulesnu.validator.htmlparser.base
module (I made up the number after looking at the history & noting that other packages have barely seen 2 commits in the past 6 years)nu.validator.htmlparser
would just be an aggregate of nu.validator.htmlparser.dom
and nu.validator.htmlparser.sax
)What makes you say it's disappointing?
Mainly having to replace one jar with many when upgrading and the resulting proliferation of jar. Maybe that's not a real problem.
I don't see the problem: when I update a dependency declaration in the POM, I hardly ever notice that the new version has introduced additional transitive dependencies.
Yes. Either
jchardet
support should be removed, or a new version with a module descriptor should be published. An alternative is to just "fork" it (e.g. create a new repo under https://github.com/validator, or just add it as a separate module in this repo).It's disappointing if the existing jar can't be used as-is. Considering that I'd like to get rid of
jchardet
eventually, I'm really not keen on de facto becoming responsible forjchardet
's distribution beyond its usage byhtmlparser
. Realistically, though, the replacement I want is vaporware without a concrete path into becoming real software, so in that sense "eventually" can be expected to be far off anyway.(That is, porting
encoding_rs
andchardetng
to Java are not going to happen as part of my job. Writing a JNI wrapper for them seems like an educational hobby project of realistic scope, but a JNI dependency wouldn't be viewed favorably by the Java community for real-world deployment. Doing a translation that would result either.java
files or Java bytecode seems like too large for what I have hobby project time for.)Is there really no backward compatibility mechanism that would allow the existing
jchardet
to be used as-is? @carlosame's most recent comment suggests that there is.
Yes, one can use any JAR as-is, and the JVM will deduce a module name from the JAR's filename. However, this is rather fragile and precludes some usages of the modules that require it (e.g. I'm pretty sure that it's impossible to use such modules with jlink
to generate a custom runtime image).
In the proposal above, I added it as a separate module, but if preferred, we could even add it as a non-exported package in the nu.validator.htmlparser.base
module.
Considering that
nu.xom
depends on SAX, it's unclear to me what problem this would solve other than allowing custom tree builders that use none of SAX, DOM, or XOM not to depend onjava.xml
.Since you've used Modules and I haven't, do you see any benefit from decoupling the core of the parser from
java.xml
Module-wise?
Maybe I'm misunderstanding your question, but I doubt it's possible to decouple any piece of the parser from java.xml
, since e.g. SAXException
is used in the public API of pretty much every package.
classes which are not of interest to users don't show up in the Javadoc, IDE autocompletion suggestions, etc.
The clutter in Javadoc doesn't look too bad. Not sure about how much the internals show up in IDE autocomplete in practice for people who don't work on the internals.
maintainers have maximum flexibility w.r.t. backward compatibility
AFAICT, in the 13-year lifespan of this project, the one API-breaking change (as opposed to parser behavior correctness change) is the removal of support for the HTML4 mode, which would not have become non-breaking had the change you propose been made ahead of time.
nu.validator.htmlparser.xom.HtmlBuilder
has these nu.validator
imports:
import nu.validator.htmlparser.common.CharacterHandler;
import nu.validator.htmlparser.common.DocumentModeHandler;
import nu.validator.htmlparser.common.Heuristics;
import nu.validator.htmlparser.common.TokenHandler;
import nu.validator.htmlparser.common.TransitionHandler;
import nu.validator.htmlparser.common.XmlViolationPolicy;
import nu.validator.htmlparser.impl.ErrorReportingTokenizer;
import nu.validator.htmlparser.impl.Tokenizer;
import nu.validator.htmlparser.io.Driver;
nu.validator.htmlparser.xom.XOMTreeBuilder
has these nu.validator
imports:
import nu.validator.htmlparser.common.DocumentMode;
import nu.validator.htmlparser.impl.CoalescingTreeBuilder;
import nu.validator.htmlparser.impl.HtmlAttributes;
It is the design intent of the parser that a third party is allowed to write the kind of wrapper that nu.validator.htmlparser.xom
is. Hiding nu.validator.htmlparser.impl
seems contrary to that goal. (Whether impl
is a good name is too late to debate: If we're not changing fully-qualified names, we're stuck with that name.)
Maybe I'm misunderstanding your question, but I doubt it's possible to decouple any piece of the parser from
java.xml
, since e.g.SAXException
is used in the public API of pretty much every package.
Good point. I had forgotten about that.
Additonal observation: Enabling normalization checking depends on ICU4J.
@carlosame, @sideshowbarker, do you see value in the common
/base
split that @anthonyvdotbe suggests?
do you see value in the
common
/base
split
base
module exports treebuilder
which is a new package of unknown contents, so nothing to comment about. I'd rather see the actual code to check whether this breaks backwards compatibility.common
package bytecode is barely 5K compressed, not even 9K uncompressed. Modules are intended to avoid classpath clutter, but I believe that a 5/9 Kb gain is not worth the extra effort to handle the additional module (with its jar file).And BTW exporting jchardet
goes in the wrong direction (while at the same time the proposal is not exporting io
nor the other packages).
do you see value in the
common
/base
split that @anthonyvdotbe suggests?
I don’t personally see value in it. But I’m not a domain expert around the fine points of packaging, and I don’t feel strongly enough about it that I’d object to it. That said, I have a general preference for keeping things as simple as possible.
Fair enough, so I think we can agree on the following:
nu.validator.htmlparser
, nu.validator.htmlparser.xom
, nu.validator.saxtree
Interner
et al.) & all packages are exported to maintain 100% backward compatibilityAnd have one open question: what with jchardet
?
I'm ok with any proposal (mine being: move it into this repo as an additional nu.validator.jchardet
module), as long as we don't have to rely on the automatic module name.
Fair enough, so I think we can agree on the following:
3 Java modules:
nu.validator.htmlparser
,nu.validator.htmlparser.xom
,nu.validator.saxtree
no classes are moved (i.e. forget about my proposal to move
Interner
et al.) & all packages are exported to maintain 100% backward compatibilityorganised in a single Git repo with a Maven multi-module project
OK.
And have one open question: what with
jchardet
? I'm ok with any proposal (mine being: move it into this repo as an additionalnu.validator.jchardet
module), as long as we don't have to rely on the automatic module name.
There's also another question whose answer might inform the answer to that question: What to do with the ICU4J normalization checking dependency. https://github.com/unicode-org/icu/blob/master/icu4j/manifest.stub says Automatic-Module-Name: com.ibm.icu
, so I assume nu.validator.htmlparser
would have requires static com.ibm.icu
. Correct?
In any case, I think jchardet
doesn't belong in this repo. I'd still like to avoid having to repackage it if possible.
@carlosame, can you corroborate the incompatibility of the existing jchardet
with jlink
?
can you corroborate the incompatibility of the existing
jchardet
withjlink
?
Both jlink
and jpackage
do not work well with automatic modules, however there is a tedious workaround using jdeps that I haven't tested myself. Two tools claim to do that process automatically for you though:
If you want to make this project more friendly to jlink
, the proper solution would be to create a jchardet
package inside the main htmlparser
module and do not export it, instead of creating a specific jchardet
module as @anthonyvdotbe proposes.
But in any case, if you ship jchardet
classes with your jar at some point you may find some guy using Java 8 that directly calls a jchardet
class from this project's jar file.
I assume
nu.validator.htmlparser
would haverequires static com.ibm.icu
I'd rather have requires com.ibm.icu
without the static
, which is what my initial patchset had and I should have not changed.
And needless to say that I see no point in having a separate saxtree
module as it has no realistic stand-alone use cases, my opinion has been already expressed several times (just making sure that I'm not a recipient for the "we can agree").
[...] so I assume nu.validator.htmlparser would have requires static com.ibm.icu. Correct?
Correct.
In any case, I think jchardet doesn't belong in this repo. I'd still like to avoid having to repackage it if possible.
Ok, let's continue using jchardet
as is, since there's workarounds for the problems it poses.
@carlosame are you planning to provide a PR for this? (just asking to avoid duplicate efforts)
@carlosame are you planning to provide a PR for this?
I already provided a modularization patch (without Maven modules) in July 5 (PR #23) and it hasn't received a lot of attention. Switching that PR to a modular Maven project (with htmlparser
and xom
modules) would not be difficult, but I'd rather not do that because the conversion of the repository to Maven modules (not JPMS modules) should be done by @hsivonen following instructions from the author of the modularization patches being applied later.
Maven modules involve a different directory layout (with a directory for each Maven module), so the files should be moved in a way that Git history is not completely lost (git mv
). Thus, IMHO Sivonen should be the author of the mv
if we want to mess with git blame
as less as we could.
Responding specifically to your question, I have no plans to provide a PR with a saxtree
module (and that rules me out given your agreement with @hsivonen about modularization). I suggest that you proceed with him to switch the repository to a Maven modular layout, and once you have achieved that you can add the module-info
's etc. (or at least that would be my suggestion on what to do next).
@hsivonen: as the de facto owner of this repository, I'd like to urge you to decide on the way forward, because I feel things are getting out of hand: @carlosame and I have different views on how to proceed, and we're both investing quite a bit of our spare time in implementing/defending our point of view. I believe that it's in everyone's interest for a final decision to be taken sooner rather than later. (I understand if modularization is not a priority for neither you nor your employer, but I believe it's not unreasonable to ask some of your time to settle this debate.)
1. Java modularization
There are plenty of options, starting from the most fine-grained modularization:
htmlparser.dom
->htmlparser.common
htmlparser.sax
->htmlparser.common
,saxtree
htmlparser.xom
->htmlparser.common
htmlparser
+saxtree
htmlparser
htmlparser
+htmlparser.xom
htmlparser
+htmlparser.xom
+saxtree
htmlparser.jaxp
+htmlparser.xom
+htmlparser.common
+saxtree
Q1: which combination makes most sense to you? (edit: please note that this question is about modules as a generic software architectural concept, not about Java modules specifically. So any Java technicalities can be disregarded when answering this question. I've merely named it "Java modularization" in the title to clarify that it's about the conceptual modularization of the code, not about how the code is organized in Maven modules or Git submodules or anything like that)
2. Project organization
Assuming there's at least 2 modules in the chose combination, there's multiple options on how to organize them:
(since @carlosame claimed that the first option would "complicate the workflow [...] and expose the project to a new category of Maven bugs": I claim the exact opposite, so please disregard any Maven-related concerns)
Q2: which option works best for you & @sideshowbarker?
3. upcoming releases
Edit: given that we agree that backward compatibility will be broken as needed to implement the decisions for 1. and 2., I'm fine with a 1.5 release as it is (except for the use of version ranges, but I'll argument that in the PR).
I'd like to propose that, no matter the answer to Q1 and Q2, version 1.5 is released with minimal changes, i.e. only changes that are required to resolve #17. In particular: no automatic module name, no changes to the dependency versions, ... Once 1.5 is published, we can then implement 1. and 2. as decided for a 2.0 release.Q3: do you agree with this proposal?Thanks in advance for reading & answering the
32 questions above, and thereby bringing back peace and quiet to this repository.Edit: the comment below demonstrates exactly why I'm saying it's in everyone's interest for a final decision to be taken sooner rather than later: there's a lot of frustration on both sides, and both sides feel the other is counterworking them. That's why I'm asking you to take an authoritative decision on this matter, so we can both accept your decision and move forward on implementing it.