imagej / imagej-ops

ImageJ Ops: "Write once, run anywhere" image processing
https://imagej.net/libs/imagej-ops
BSD 2-Clause "Simplified" License
88 stars 42 forks source link

Rewrite OpSearchResults as signatures #644

Closed gselzer closed 2 years ago

gselzer commented 2 years ago

This PR rewrites OpSearchResults, with the goal to reduce the number of near-duplicate search results from the OpSercher.

The OpSignature

To reduce the number of duplicate search results, we'd ideally differentiate only by algorithm, folding all implementations of that algorithm into one search result. Our task, then, is to group Ops into sets, where each set corresponds to a different algorithm.

In ImageJ Ops, we can write implementations with overloading signatures; we then let the Op matcher choose the best implementation for our inputs. Thus the most consistent folding behavior would align with the folding behavior of the matcher. In other words, groups should share:

Ideally, we would also be able to fold secondary parameters into a single search result; this PR does not introduce that functionality into this PR as of writing. I avoided this functionality knowing that SciJava Ops does not differentiate between primary and secondary parameters. Implementing secondary parameter folding would likely result in inconsistent results between ImageJ Ops and ImageJ Ops2.

To make these conditions concrete, we introduce OpSignature, a class designed to encapsulate the information common to a set of implementations. Each OpInfo should be able to provide an OpSignature, and Ops that are different implementations of the same algorithm should provide the same OpSignature.

We also introduce the OpSignatureModule, which can execute any known Ops described by OpSignature given suitable inputs, and the OpSignatureInfo that can create the OpSignatureModule. These classes allow OpSignatures to be runnable in Fiji just like Ops and Modules, along with access to the pre- and postprocessor chain of the Context.

Searching for Ops

The OpSearcher currently returns one result for each implementation available containing the searched text in its name. With this PR, the results can easily be grouped by OpSignature. Simplified types are provided to the OpSearcher, allowing the folding to occur in an extensible way, with easy support for simplified type addition. For each group, only one OpInfo's OpSignature is chosen; this OpSignature is responsible for delegating to all Ops in the set.

This change will make searching for Ops in a graphical setting (e.g. Fiji, napari) much easier, and should be well reviewed.

TODO:

gselzer commented 2 years ago

@ctrueden do you have time to review this?

I think it's pretty clean, but would want to do more testing and get feedback before a merge.

ctrueden commented 2 years ago

do you have time to review this?

I put it on my list for Wednesday. Is that soon enough?

gselzer commented 2 years ago

do you have time to review this?

I put it on my list for Wednesday. Is that soon enough?

Yeah, I think that's fine!

gselzer commented 2 years ago

I was thinking about this over the weekend, and I wanted to record my thoughts:

OpSignature feels like the key to exposing/learning about Ops' functionality. It should be treated as a hint to the user saying "hey, I know there's an Op that looks like this!" Users should then build an OpRef aligning with a particular OpSignature, and that OpRef then gets matched to the OpInfo most tailored to the inputs.

Given OpSignature, though, one cannot currently assume there is an Op with the input types of the OpSignature. This is due to the way that inputs are currently folded, via OpSignature.reduce(). The goal of that method is to:

  1. Combine similar signatures
  2. Describe the input and output types in a way that is coherent for users.

E.g. let's say you have two Ops:

The user will only care about the fact that Ops has the functionality foo.bar(image, number); this is what we want to display to users, and OpSignature allows us to do it! But if you pass a Img<LongType> then you won't get a match (until ImageJ Ops2, at least), so you might come away thinking the OpSignature lied to you... :disappointed: But we don't have to reduce.

The difference between OpInfo, OpRef, and OpSignature:

ctrueden commented 2 years ago

@gselzer I like your succinct description of OpInfo vs OpRef vs OpSignature. Makes total sense. The only thing that makes me nervous is the name OpSignature. The term "signature" is usually used in computing to mean something for validating identity. Whereas in the context of this op reduction/simplification mechanism, that is not at all what it's about. So I would encourage you to rename it to something else. So then, what color should the bike shed be? I kind of like OpApproximation for its honesty. Or maybe OpStandardization, since the point is to standardize each argument to some baseline agreed-upon pseudo-types, right?

gselzer commented 2 years ago

The only thing that makes me nervous is the name OpSignature

I'm happy to debate names!

The term "signature" is usually used in computing to mean something for validating identity.

Sure, that meaning exists, but I think the meaning of inputs and outputs to a function is more common. This Wikipedia page is the first result I see when searching "computer science signature".

I kind of like OpApproximation for its honesty. Or maybe OpStandardization, since the point is to standardize each argument to some baseline agreed-upon pseudo-types, right?

I prefer the latter, since you are right about the point of this work. But approximation and standardization also have their own meanings.

I picked @elevans' brain, and he suggested OpProxy, which I like.

ctrueden commented 2 years ago

I am OK with OpProxy, although it feels a bit overly general to me. But I like it a lot more than OpSignature.

gselzer commented 2 years ago

Another name I like is OpListing, if we are viewing this thing as the thing that is presented to users

ctrueden commented 2 years ago

Ooh yeah, OpListing is :+1: from me.

gselzer commented 2 years ago

Ooh yeah, OpListing is +1 from me.

Alrighty, just renamed it.

gselzer commented 2 years ago

Here's another question I've been struggling with:

Should the OpListing contain parameter names? I can think of reasons for and against:

For:

Against:

ctrueden commented 2 years ago

@gselzer There are still WIPs in this PR. I will take care of it this time as part of my review, but please, when filing PRs, make a clean history.

gselzer commented 2 years ago

@gselzer There are still WIPs in this PR. I will take care of it this time as part of my review, but please, when filing PRs, make a clean history.

Thanks. My plan was to squash+merge, would it make review easier to squash now?

hinerm commented 2 years ago

@gselzer does this PR actually work for you in Fiji?

Here's what I get when I search for something: image

gselzer commented 2 years ago

@gselzer does this PR actually work for you in Fiji?

Here's what I get when I search for something: image

Dang, I see that in Ops too now, @hinerm. I'm guessing it's actually an issue in that Types utility class, but I'll see what I can do to mitigate it.

gselzer commented 2 years ago

Rewriting this line to be .filter(e -> Types.isAssignable(Types.raw(type), e.getKey())) fixes that issue. This is likely the correct fix.

But I was correct that this is a deeper issue within Types. It turns out that CaptureTypeImpls are propagating into the Types.isAssignable checks in the link above, due to the way that the CommandModuleItem determines the parameterized type of the inputs. CaptureTypeImpls have no support in Types.isAssignable, leading to the throw IllegalStateException here.

Do you agree that the Types.raw() addition is correct, @hinerm?

hinerm commented 2 years ago

Do you agree that the Types.raw() addition is correct, @hinerm?

@gselzer this seems reasonable to me!

CaptureTypeImpls have no support in Types.isAssignable, leading to the throw IllegalStateException here.

OK.. I'm assuming CaptureType is truly necessary despite the presence of WildcardType. If it doesn't already exist there should be an issue in SJC to add support in Types.isAssignable.

hinerm commented 2 years ago

@gselzer do you have local fixes for the issue I brought up that you can push, or is anything holding them up? How urgent is this PR to be merged before next week?

gselzer commented 2 years ago

@hinerm I was actually just about to push them :sweat_smile:

Just making sure that the tests pass with the changes

gselzer commented 2 years ago

@hinerm can you see if you prefer this change in Fiji?

hinerm commented 2 years ago

@gselzer Well there's no errors but it looks like the ops aren't actually being consolidated now..?

image

gselzer commented 2 years ago

I'm guessing we just need improvements to the the reduction map @hinerm.

For example, no primitves are getting reduced right now, as they aren't in the map. That's why we see so many math.add(img, value)-like op results.

I'll see what I can do.

EDIT: Starting off with primitives, we get: image

hinerm commented 2 years ago

@gselzer the search results are looking good!

But.. they may not actually be functional? For example, if I run math.add(number, number) nothing happens. image

I expected a particular PrimitiveMath.IntegerAdd signature to "win" and run (e.g. manually running add(int, int)): image

gselzer commented 2 years ago

For example, if I run math.add(number, number) nothing happens.

The issue is that we are getting stuck (silently :frowning:) in input harvesting.

Specifically, the AbstractInputHarvester is trying to create a org.scijava.widget.DefaultWidgetModel, which fails here since the DefaultModuleService cannot convert "0" into a Number. Previously, we were converting "0" into some primitive type, which the ConvertService must have been able to do.

Thus we should probably make the fix in scijava-common.

Unfortunately, I'm guessing that this isn't an isolated scenario and that we will see similar errors pop up.

gselzer commented 2 years ago

@hinerm so math.add runs now :sweat_smile:

The issue is that scijava-common had no support for Number, only for its subclasses. See scijava/scijava-common#441 for the fix that allows us to run that result. You can build it and install it in Fiji and things "work"

Unfortunately, the matching is wonky. I entered in "a=63.5" and "b=40" and got a result of "103". After some digging I found out that this seems to be a matching bug, as the Op IntegerAdd is being called for priority reasons :unamused:

But all of this work really seems beyond the scope of this PR; it's all bugs in upstream libraries.

I'm going to take another look at this PR tomorrow and see if we can't figure out any other issues with this code (or if there are any I missed, as I saw @ctrueden had some suggestions above)

hinerm commented 2 years ago

But all of this work really seems beyond the scope of this PR; it's all bugs in upstream libraries.

@gselzer I agree. Things look good and I also merged https://github.com/scijava/scijava-common/pull/441

the Op IntegerAdd is being called for priority reasons

My first intuition is that we need to decouple parameter harvesting from actually running the Op. So if the user picks math.add(number, number) we need to harvest two Number params and then use those to match and run the Op.

gselzer commented 2 years ago

My first intuition is that we need to decouple parameter harvesting from actually running the Op. So if the user picks math.add(number, number) we need to harvest two Number params and then use those to match and run the Op.

Yup, that's what happens (it happens here)

hinerm commented 2 years ago

Yup, that's what happens

Annoying! OK, then I'd say don't worry about it for now