locationtech / jts

The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
Other
1.97k stars 442 forks source link

Musing about non-functional code cleanup #918

Open RichMacDonald opened 2 years ago

RichMacDonald commented 2 years ago

I have been using JTS for 15 years but have not contributed any code.

If it were "my code", there are a number of "low-hanging-fruit cleanups" I would do immediately, such as:

  1. Adding generics where it helps ... without becoming a PITA. )There are a few obvious places. Much of it is trivial, though. An eyesore at most. Not a source of bugs. I just like to clean up code and help readability.)

  2. Adding @Override to all the methods that do so. This is a single "do them all" command in Eclipse.

  3. Making a lot of classes final, to give the JIT compiler better options. In the past, conventional wisdom has been NOT to use final for performance/GC-reduction, but the JIT has made the old assumptions obsolete. (Even for JDK 8.) As a simple example, ArrayListVisitor does nothing but wrap an ArrayList, at the expense of a new instance. The JIT is now capable of inlining the entire class so that the ArrayListVisitor is never instantiated. (Without testing, I would not be surprised if the JIT does not instantiate ArrayListVisitor in most usages, but declaring it final makes the job easier for the JIT: As long as the class is not marked final, the JIT has to maintain "undo" code in case it ever loads another jar at runtime that contains a subclass of ArrayListVisitor.)

3a. Obviously, inlining ArrayListVisitor would not cause a measurable improvement. But classes like DistanceOp and even Coordinate (with difficulty) could also be turned into final classes. Perhaps this could eventually cause a significant improvement in performance.

  1. Convert to "enhanced for loops". Once again, a single "do them all" in Eclipse.

  2. I see uses of "Math.sqrt(dx dx + dy dy)" instead of the more recent "Math.hypot(dx, dy)". Intentional?

  3. A lot more interfaces instead of classes. I maintain a parallel set of Geospatial shape classes in my own project, and I have to create parallel instances for all the JTS work. If the Geometry class hierarchy were an interface hierarchy, a lot of unnecessary client work would go away. Maybe or maybe-not: This is a very open question whether or not interfaces would improve things. But I'd like to try.

I am an experienced programmer, so I understand "don't change what ain't broke". But JTS is brilliant code and it would be great to see it move out of its "early JDK look", no offense intended.

Of course, legacy industry code depending on JTS is a huge concern, so that would override many of these ideas. (Imagine turning Coordinate into an interface! And who knows which final candidate classes have been subclassed "in the wild".)

I have not seen any discussion of these sorts of ideas, so I wanted to know what others think. And what has been discussed in the past?

Thanks for all the hard work. Rich MacDonald

dr-jts commented 2 years ago

Good suggestions, Rich. Some comments below:

  1. Adding generics where it helps ... without becoming a PITA. )There are a few obvious places. Much of it is trivial, though. An eyesore at most. Not a source of bugs. I just like to clean up code and help readability.)

Yes, as long as they don't change public API (I think?)

  1. Adding @OverRide to all the methods that do so. This is a single "do them all" command in Eclipse.

Makes sense.

  1. Making a lot of classes final, to give the JIT compiler better options. In the past, conventional wisdom has been NOT to use final for performance/GC-reduction, but the JIT has made the old assumptions obsolete. (Even for JDK 8.) As a simple example, ArrayListVisitor does nothing but wrap an ArrayList, at the expense of a new instance. The JIT is now capable of inlining the entire class so that the ArrayListVisitor is never instantiated. (Without testing, I would not be surprised if the JIT does not instantiate ArrayListVisitor in most usages, but declaring it final makes the job easier for the JIT: As long as the class is not marked final, the JIT has to maintain "undo" code in case it ever loads another jar at runtime that contains a subclass of ArrayListVisitor.)

3a. Obviously, inlining ArrayListVisitor would not cause a measurable improvement. But classes like DistanceOp and even Coordinate (with difficulty) could also be turned into final classes. Perhaps this could eventually cause a significant improvement in performance.

Good idea

  1. Convert to "enhanced for loops". Once again, a single "do them all" in Eclipse.

Good idea.

  1. I see uses of "Math.sqrt(dx dx + dy dy)" instead of the more recent "Math.hypot(dx, dy)". Intentional?

Not intentional, just old code.

  1. A lot more interfaces instead of classes. I maintain a parallel set of Geospatial shape classes in my own project, and I have to create parallel instances for all the JTS work. If the Geometry class hierarchy were an interface hierarchy, a lot of unnecessary client work would go away. Maybe or maybe-not: This is a very open question whether or not interfaces would improve things. But I'd like to try.

This is a goal of JTS 2. It's a big API change though, which is why it hasn't been done.

Of course, legacy industry code depending on JTS is a huge concern, so that would override many of these ideas. (Imagine turning Coordinate into an interface! And who knows which final candidate classes have been subclassed "in the wild".)

Yes, backwards compat is definitely a concern. As you say, adding final will have to be done carefully - or perhaps that's too risky to do.

If you want to submit PRs for some of these, that will allow a more focussed discussion.

RichMacDonald commented 2 years ago

I forked and added most of the Eclipse code refactorings. I did it in stages, in case it helps. My final cleanup can be found at https://github.com/RichMacDonald/jts/tree/rjm-organize-imports. Look for the eclipse preferences file at \build-tools\src\main\resources\jts\eclipse-java-preferences.xml.

Note that this is a very aggressive change, and it is likely you will disagree with some choices. My point is rather to show "what if", to give the principle developers ideas. Once people have decided which settings they like, it is very fast to start over from master.

As for simple generics cleanup, I did a quick branch at https://github.com/RichMacDonald/jts/tree/rjm-generics001. This would be suitable for a pull request, if you concur I am doing things correctly.