Open damccorm opened 2 years ago
Should we make returning just the inference the default and handle this with a composable ModelHandler that produces PredictionResult instances, similar to KeyedModelHandler?
Unfortunately that couldn't be done in a backwards compatible way.
CC: @robertwb
Subtask of https://github.com/apache/beam/issues/22117
> Should we make returning just the inference the default and handle this with a composable ModelHandler that produces PredictionResult instances, similar to KeyedModelHandler? Unfortunately that couldn't be done in a backwards compatible way.
My instinct is that the large majority of users are going to want both passed through and we should prioritize that need (and not making a breaking change) over making things a little cleaner for the just inference case.
The other advantage of always returning a PredictionResult is that it gives users more flexibility if, for example, they initially only need the inferences but later want to do something with the input itself. If we just return the inferences, then they need to insert some intermediate mapping step to make their existing logic work; if we always return a PredictionResult, then they can flip the flag, grab the initial input wherever they need it, and not touch anything else.
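The flexibility argument above can be illustrated with a minimal sketch. This uses a plain NamedTuple as a stand-in for Beam's actual PredictionResult (which bundles the input example with its inference); the data and names are illustrative:

```python
from typing import Any, NamedTuple

# Stand-in for apache_beam.ml.inference.base.PredictionResult.
class PredictionResult(NamedTuple):
    example: Any
    inference: Any

results = [
    PredictionResult(example=[1.0, 2.0], inference=0.9),
    PredictionResult(example=[3.0, 4.0], inference=0.1),
]

# If the pipeline only ever returned bare inferences, downstream code that
# later needs the input would have to be rewired. Because PredictionResult
# keeps the example around, both views are available without restructuring:
inferences = [r.inference for r in results]          # just the predictions
pairs = [(r.example, r.inference) for r in results]  # input recoverable later
```

Downstream logic that starts out consuming only `inferences` can later switch to `pairs` without any change to the inference step itself.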
There's also a larger philosophical question of when we should introduce a new model handler vs. parameterizing a new behavior option. I'd argue that because in this case omitting the input is essentially a subset of the current output (with no changes to the input), and we're not changing the graph structure, we're better off treating this as an option (a parameter to the RunInference call) than introducing a new handler.
What would be the plan of action on this one? Looking at the comments, it seems PredictionResult will remain the same. Are we adding an optional flag to exclude the input from the PredictionResult?
I guess if someone wants to exclude the outputs, they can do it using beam.Map() or a PostProcessor.

Edit: I guess if someone wants to exclude the inputs, they can do it using beam.Map() or a PostProcessor.
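The beam.Map() approach mentioned here can be sketched as follows. A plain NamedTuple stands in for Beam's PredictionResult, and plain `map` stands in for the pipeline step, so the snippet runs without a Beam installation:

```python
from typing import Any, NamedTuple

class PredictionResult(NamedTuple):  # stand-in for Beam's PredictionResult
    example: Any
    inference: Any

def drop_example(result: PredictionResult) -> Any:
    """The function one would pass to beam.Map to keep only the inference."""
    return result.inference

# In a real pipeline this step would read:
#   predictions | beam.Map(drop_example)
results = [
    PredictionResult(example="input-a", inference=1),
    PredictionResult(example="input-b", inference=2),
]
bare = list(map(drop_example, results))
```

This is the "intermediate mapping step" referred to earlier in the thread: cheap to write, but something every user wanting inference-only output would have to add themselves.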
> Are we adding an optional flag to exclude the input from the PredictionResult?

Yeah, I think that's the best option here.

> I guess if someone wants to exclude the outputs, they can do it using beam.Map() or a PostProcessor.

I don't quite follow this - why would someone want to exclude the outputs? Do you just mean unnesting the outputs from the PredictionResult?
I am sorry. It's inputs, not outputs.
.take-issue
I ran a few experiments on the Dataflow Runner using the PyTorch model handler and didn't notice a significant change in resource usage. @damccorm what do you think?
First image: default; second image: PredictionResult with input = None.
I think the difference largely will depend on workload. For our examples it is probably not significant because we're pretty much just writing the inference results immediately, but I can imagine if (a) your input size is significantly larger or (b) you're doing more than simple maps post-inference (especially if you're doing something that breaks pipeline fusion and causes work to be distributed to new workers), then the costs would be higher.
I'm generally still in favor of doing this as a small convenience for users since it's cheap.
drop_example could be passed via the ModelHandler or via beam.RunInference. As of now, I added it to the RunInference PTransform, since that is what outputs the PredictionResult. But in the process method of the transform, it calls self._model_handler.run_inference(), which returns the PredictionResult, so it could be passed via the ModelHandler as well.
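The two placements being discussed can be sketched roughly as below. These are plain Python stand-ins, not Beam's actual classes; the names and structure are illustrative only. The sketch follows the transform-level placement described in the comment above, with the input set to None when the flag is on (matching the "input = None" experiment earlier in the thread):

```python
from typing import Any, NamedTuple

class PredictionResult(NamedTuple):  # stand-in for Beam's PredictionResult
    example: Any
    inference: Any

class ModelHandler:
    """Handler-level placement would mean run_inference itself honors
    the flag. Here it always builds the full result."""
    def run_inference(self, batch, model):
        return [PredictionResult(x, model(x)) for x in batch]

class RunInference:
    """Transform-level placement: the flag is applied after the handler
    returns PredictionResults."""
    def __init__(self, model_handler, drop_example=False):
        self._model_handler = model_handler
        self._drop_example = drop_example

    def process(self, batch, model):
        results = self._model_handler.run_inference(batch, model)
        if self._drop_example:
            # Keep the PredictionResult shape, but drop the input payload.
            return [r._replace(example=None) for r in results]
        return results

transform = RunInference(ModelHandler(), drop_example=True)
out = transform.process([1, 2], lambda x: x + 1)
```

Either placement produces the same output; the difference is whether each handler must implement the flag or whether the transform applies it uniformly after the handler returns.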
@AnandInguva I'm reopening this one since we're trying https://github.com/apache/beam/pull/23356 instead.
RunInference currently returns both the example/input and prediction.
Users may want the ability to only return the inference to minimize potential memory/serialization issues later in the pipeline. They can do this with a flag, and the return value may look like
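The description trails off before showing the return value. As a hedged sketch of what the two shapes could look like (a stand-in NamedTuple and a hypothetical flag name, not the actual API):

```python
from typing import Any, NamedTuple

class PredictionResult(NamedTuple):  # stand-in for Beam's PredictionResult
    example: Any
    inference: Any

def run_inference(batch, model, return_full_result=True):
    """Hypothetical flag-controlled output: PredictionResults by default,
    bare inferences when the flag is flipped."""
    inferences = [model(x) for x in batch]
    if return_full_result:
        return [PredictionResult(x, y) for x, y in zip(batch, inferences)]
    return inferences

double = lambda x: 2 * x
full = run_inference([1, 2], double)                            # PredictionResults
bare = run_inference([1, 2], double, return_full_result=False)  # inferences only
```

With the flag off, only the inferences flow downstream, avoiding the cost of carrying (and serializing) the input examples through the rest of the pipeline.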
Imported from Jira BEAM-14362. Original Jira may contain additional context. Reported by: yeandy. Subtask of issue #21435