Open vfdev-5 opened 3 years ago
Hi @vfdev-5 ,
Thanks for raising this ticket.
The idea is that: currently, MONAI post transforms(independent with ignite) execute on the engine.state.output
dict, the post transforms can't get some useful information in engine.state.batch
, for example, the meta_dict of images or some transform context we put in the engine.state.batch
. As we discussed in another ignite ticket before, I think we need an end-to-end
dict, which is something like "combine engine.state.batch
and engine.state.output
", maybe it's better to use only 1 dict: engine.state.data
instead. Then the data flow can run through all the components and no need to write 2 lambda funs (batch_transform
and output_transform
) in handlers.
I didn't find any benefit to split the data flow into state.batch
and state.output
instead of only 1 state.data
.
Thanks in advance.
Hi @vfdev-5 ,
May I know do you guys have any updates on this topic? We start to develop MONAI v0.6 now, making a clear data flow for all the components is a very important topic for v0.6. We plan to use a overall dict to set/get data for all the components, like _data = {image: ..., imagemeta: ..., label: ..., pred: ...}: (1) image readers read medical image content and the meta data, (2) transforms operate image and meta data, (3) post transforms can do logic based on prediction and some input mask data, (4) saver handlers can get input image file names from meta data and save the prediction, etc.
If ignite can provide end-to-end sharable data in engine.state
or combine batch
and output
, that would be great.
Otherwise, we need to combine batch
and output
in MONAI and drop all the usage of engine.state.batch
, I think maybe it's not a good idea as it may be against ignite design for some unseen cases.
Thanks.
Hi @Nic-Ma
We haven't yet advanced on the topic, unfortunately. When I was thinking about it, a certain blockers came in mind as we do not force user to structure the batch or output = we do not know what is inside batch
or output
. So, minimally we can introduce engine.state.data
as a dictionary like:
engine.state.data = {
"batch": engine.state.batch,
"output": engine.state.output,
}
Do you think we could specify them more using some tricks ?
Hi @vfdev-5 ,
In my simple mind, the easiest way for MONAI is to change this line: https://github.com/Project-MONAI/MONAI/blob/master/monai/engines/trainer.py#L160 into:
engine.state.output = dict(engine.state.batch)
engine.state.output.update({Keys.PRED: preds, Keys.LOSS: loss})
Then we can use engine.state.output
in post transforms and handlers as the overall data flow, and don't use engine.state.batch
anymore.
Is there any drawback?
Thanks.
engine.state.output = dict(engine.state.batch)
engine.state.output.update({Keys.IMAGE: inputs, Keys.LABEL: targets})
I think that can be reasonable.
We can also think about unrolling batch and output (if sequence or mapping) into data like that:
engine.state.data = {}
if isinstance(batch, Sequence):
engine.state.data.update({f"input_{i}": value for i, value in enumerate(batch)})
elif isinstance(batch, Mapping):
engine.state.data.update(batch)
else:
engine.state.data["batch"] = batch
if isinstance(output, Sequence):
engine.state.data.update({f"output_{i}": value for i, value in enumerate(output)})
elif isinstance(output, Mapping):
engine.state.data.update(output)
else:
engine.state.data["output"] = output
Hi @vfdev-5 ,
Thanks for your confirm, let me move forward in MONAI with engine.state.output = dict(engine.state.batch)
first.
And I think your proposal of engine.state.data
can't work for our case, we don't want post transforms to depend on ignite, so we can't pass engine.state.data
to post transforms to parse data from ignite keys batch
and output
, so I prefer to combine data in engine.state.output
and just pass it to post transforms, then post transforms only need to handle a regular input dict.
I will ask your review when my first PR is ready.
Thanks.
Based on our online discussion with @vfdev-5 , we have a draft idea for dataflow:
batch
dict for model input and output
dict for model output anymore. Transforms can read / write data dict. Lifetime is only 1 iteration.Thanks.
Hi @vfdev-5 ,
Do you have any updates about your ideas or plans for this feature request?
Thanks in advance.
Hi Nic,
Thanks for pinging! We haven't yet thought about the implementation for this feature.
I have few comments on your previous comment : https://github.com/pytorch/ignite/issues/1923#issuecomment-847142939 (by the way, thanks a lot for the summary)
2) State with lifetime of 1 run. Currently, State is persistent and its lifetime is same as the one of the Engine. This is due to resuming feature we are currently supporting and will improve in future versions. The use-case is the following:
trainer = ...
trainer.run(data, max_epochs=10) # starts from 0 up to 10 epochs
trainer.run(data, max_epochs=20) # starts from 10 up to 20 epochs
3) I think we can rather easily provide something as here : https://github.com/pytorch/ignite/issues/1923#issuecomment-820557317
# At initialization and before GET_BATCH_STARTED:
engine.state.data = {}
# on GET_BATCH_COMPLETED:
if isinstance(batch, Sequence):
engine.state.data.update({f"input_{i}": value for i, value in enumerate(batch)})
elif isinstance(batch, Mapping):
engine.state.data.update(batch)
else:
engine.state.data["batch"] = batch
# on ITERATION_COMPLETED:
if isinstance(output, Sequence):
engine.state.data.update({f"output_{i}": value for i, value in enumerate(output)})
elif isinstance(output, Mapping):
engine.state.data.update(output)
else:
engine.state.data["output"] = output
What do you think ?
Hi @vfdev-5 ,
Thanks for your proposal!
I think it looks good to me. Would it be available in ignite v0.4.5 release?
@wyli, do you have any other comments? I think it's a good enhancement for ignite and we can update our MONAI components to only use engine.state.data
later.
Thanks.
🚀 Feature
Following the discussions from https://github.com/Project-MONAI/MONAI/issues/1987 , we may think of providing a unified data structure where we could merge input batch and output to simplify data access inside user handlers.
@Nic-Ma could you please detail the pitch of the idea. Thanks a lot !