Reflection for version 2.0

willcohen commented 6 years ago

A reflection warning crept back into CRS when fixing compatibility for older clojure versions. It makes me think it may be worth fixing the test suite's reflection warnings after all, so new warnings don't get lost in the noise. This PR fixes the new CRS reflection, and then uses hinted helper functions so that the interop functions in the tests themselves don't each need hints.

Separately, there's one remaining reflection warning in flare (a midje dependency), but it hasn't been updated in years, and so it seems unlikely that midje will be able to eliminate it any time soon. Seeing as how CircleCI has moved away from midje, would you all be interested in moving the tests back to clojure.test? Midje isn't doing anything too fancy here, so I suspect it could be a pretty straightforward move.

worace commented 6 years ago

These seem like 2 separate questions to me -- one of dealing with reflection warnings in our dev/test environments, and one of whether to switch from midje to clojure.test.

I think we may be getting a little carried away on the reflection-chasing, and maybe we have been over-emphasizing that in some issues or code reviews. Clojure is a dynamic language after all, and in some scenarios reflection is just going to be how it works. We definitely don't care about reflection in our test dependencies since we're not really concerned about performance there. If the reflection warnings in the test suite are getting too noisy we can just take them out, and possibly replace them with some separate task that we run manually and more sparingly. We should keep in mind the main benefit of reflection warnings, which is to help avoid repeated dynamic overhead in situations like tight inner loops where it potentially adds up a lot. I don't think we need to be overly zealous about chasing down every single occurrence, especially if they aren't in hot code paths. The "best practices" section in this article seems relevant here: http://clojure-goes-fast.com/blog/performance-nemesis-reflection/.

On the midje vs clojure test question: AFAIK the tests in this project have never been on clj test, so there wouldn't really be any going back to it per se. I'm somewhat sympathetic to the idea of using the simplest tool possible, so in that sense clojure.test does have some appeal. However there are a few key features of midje that have pushed me more toward it over time:

Midje has a nicer (IMO) repl workflow, as evaluating a midje fact form immediately evaluates the test cases within it, which makes it easier to quickly evaluate single tests you're interested in. With clojure.test afaik deftest is just a macro for defining the test, and to evaluate it you need to then manually call the function it generates for you, or call (run-tests) to run thema ll
Midje has a pretty solid auto-test tool built in via lein midje :autotest. lein-test-refresh makes a solid attempt at adding this into clojure.test, but in my experience it has always been a little flaky and seems to require frequent restarts if stuff goes wrong.
The midje matchers, while I don't use them all the time, are fairly useful when you need them. Especially stuff like n-of, in-any-order, etc can be fairly handy for quickly describing more complex collection results.

Number 1 and 2 are especially valuable to me for enabling a fast and iterative TDD workflow, and I've never really been able to recreate them as effectively using clojure.test. If there are good techniques for getting around these issues I'd be curious to check them out.

I'm personally not overwhelmingly swayed by the CircleCI example. That article is almost 5 years old and lots of folks are still using midje these days, and on top of that their use-case of a very large, very complex server-side application (presumably with a large number of contributors as well) is pretty different from ours.

Finally something like this does introduce a fair amount of churn, so I'd be reluctant to do it unless there was a compelling benefit. I realize you have already done the work in #34, but it has the unfortunate side effect of obscuring much of the version history for anyone working within those files.

willcohen commented 6 years ago

Thanks for thinking about all this! I agree that it's good not to get too carried away -- I guess I was just taking the "when writing a library, maybe" advice in the best practices section of the clojure-goes-fast article too much to heart.

Happy to ignore switching from midje (and therefore, #34) -- I figured I might as well just run through the idea in a separate PR to see how disruptive it would be rather than just mention it in an issue and leave it at that. Points 1-3 are quite valid -- they haven't bothered me in my own workflow, but my personal preferences are exclusively my own.

With all that in mind, how would you like to proceed with this particular two-commit PR? The first commit fixes a new reflection that may happen up to twice every time create-transform is called, which might be a hot loop for users converting stuff, so I do still think it's worth fixing. Much of the reflection being fixed in the second commit is partially my fault and it feels like a 'regression' in terms of warning noise, so to speak -- while some of the tests in JTS predate all the big changes for 2.0, all the interop warnings related to the SRIDs and reprojection is new and arguably due to my laziness in test writing, but it is less important for the end user.

Addtionally, in actually writing out the tests more literally in #34, I did notice one thing that may be good to think about before 2.1, re epsg-str? and proj4-string?:

proj4-string? is a clear predicate kind of function and returns clean true/false, while epsg-str? returns truthy/falsey things that are either nil or filled with content. Might it be better to keep it true/false as well? If so, I suppose this would mean that the current epsg-str? would be an internal helper function so that it could still be used for the conversion function or be the test to return true/false for a clean predicate.
Should the functions themselves have a consistent naming scheme -- either changing proj4-string? to proj4-str? or epsg-str?/epsg-str->srid to epsg-string??

willcohen commented 6 years ago

Thinking more about this potential hot loop during transformation, there's definitely a built-in inefficiency that I ignored before for simplicity.

Let's say we've got a whole bunch of geometries in one projection to be transformed to another one -- all the features from a shapefile or geojson or whatever. Right now the way to do this is to map transform-geom onto them all, which is going to generate a create-transform for each individual geometry.

If we modified the 2-arity call to transform-geom from [g crs] to something like [g target], we could accept target as being one of two things:

If it's a CRS, proceed as currently written.
If it's a pre-existing transform object (generated once using create-transform), apply a new arity version of tf that runs on an existing transform instead of generating a new one based on the two systems. This would greatly speed up transforming large datasets like, say, every census block in a state and reduce needing to generate many many identical transform objects. I'm pretty sure we can still structure it so that it tries the existing functionality first and only jumps to try it as a transform if that fails, so that there shouldn't really be additional overhead for existing usage. The downside here is that it does mean that when locationtech's proj4j finishes incubating, the new namespace of these transform objects will mean the API breaks and geo will need another major version bump, but I think that's technically been true already since create-transform isn't private.

If this seems useful, I can try to work on this in a new PR.

willcohen commented 6 years ago

@worace Thanks a bunch for the 2.1 release, and for all of your help in getting all of these pieces integrated into geo! (and sorry for the delay, was travelling the last two weeks).

I simplified this PR to stick to simply fixing the new reflection in starts-with? and includes?, rather than messing with all of the testing changes, and rebased to master.

worace commented 6 years ago

@willcohen sorry, I've been putting my time into other projects recently so haven't been looking at this as much. Was glad to get the 2.1 release out, thanks again for your work on that. We have been using the H3 stuff some on a project and it is great. This looks good as well.

Factual / geo

Reflection for version 2.0 #33