Open NivekT opened 2 years ago
For `TarArchiveReader`, should we add a deprecation warning in the main branch now that the 0.3.0 branch cut has been finished?
Another Misc tracker:

| Name | Module | Deprecation Version | Status | Earliest Removal Version |
|---|---|---|---|---|
| torch.utils.data.graph.traverse | Core | 1.13 | Deprecating | 1.15 / 2.1 |
I see RoutedDecoder has been marked as deprecated: what is it going to be replaced by?
@BlueskyFR IIRC, we plan to remove this DataPipe in the future. The general reason is that we think this can be easily achieved by using a `demux` based on file type, decoding each resulting DataPipe accordingly, and then using `mux` to merge them back together. Glad to hear your use case.
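To make the demux, decode, mux idea concrete, here is a minimal plain-Python sketch of the pattern. It deliberately does not use TorchData: `demux_by_extension`, `decode_json`, `decode_text` and this local `mux` are hypothetical stand-ins written for illustration, and the choice to keep interleaving until every stream is exhausted is an assumption of this sketch, not a statement about how the library's `mux` behaves.

```python
import json
from itertools import zip_longest
from pathlib import PurePath


def demux_by_extension(items, extensions):
    """Split (path, payload) pairs into one list per extension, keeping order.

    Raises KeyError for an extension that was not listed.
    """
    buckets = {ext: [] for ext in extensions}
    for path, payload in items:
        buckets[PurePath(path).suffix].append((path, payload))
    return [buckets[ext] for ext in extensions]


def decode_json(item):
    path, payload = item
    return path, json.loads(payload)


def decode_text(item):
    path, payload = item
    return path, payload.decode("utf-8")


def mux(*streams):
    """Interleave the streams round-robin, skipping streams that ran out."""
    missing = object()
    for group in zip_longest(*streams, fillvalue=missing):
        for element in group:
            if element is not missing:
                yield element


files = [("a.json", b'{"x": 1}'), ("b.txt", b"hello"), ("c.json", b"[2, 3]")]

# 1. route by file type, 2. decode each branch with its own function, 3. merge
json_items, text_items = demux_by_extension(files, [".json", ".txt"])
merged = list(mux(map(decode_json, json_items), map(decode_text, text_items)))
```

Each branch only knows about one file type, so adding a new format means adding one more branch and one more decode function rather than extending a single routed decoder.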
I don't understand: how should I proceed to decode a PNG image in the current state then?
You can use a map function like `datapipe.map(decode_fn)` to decode the PNG image.
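For illustration, a `decode_fn` of this kind could look roughly like the sketch below. It assumes each item is a `(path, stream)` pair and uses Python's built-in `map` in place of `datapipe.map`. To stay standard-library only, this hypothetical `decode_fn` reads just the PNG header (width and height); a real one would call an image library to produce pixel data. `make_png_header` is a helper invented here to build sample input.

```python
import io
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_fn(item):
    """Hypothetical decoder: turn (path, stream) into (path, header info)."""
    path, stream = item
    data = stream.read()
    # A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then width and height as big-endian uint32.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return path, {"width": width, "height": height}


def make_png_header(width, height):
    """Build only the signature and IHDR prefix of a PNG (sample data, no CRC)."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return PNG_SIGNATURE + struct.pack(">I", len(ihdr)) + b"IHDR" + ihdr


samples = [
    ("a.png", io.BytesIO(make_png_header(4, 3))),
    ("b.png", io.BytesIO(make_png_header(16, 9))),
]

# Built-in map stands in for datapipe.map(decode_fn)
decoded = list(map(decode_fn, samples))
```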
Okay, but why was support for decoding dropped then?
Decoding didn't do anything beyond what a `map` function does, except that we provided a few decoding functions for convenience. To support `routed_decode`, we would need to add lots of decoding functions to cover general file decoding, which is not sustainable for us to maintain and makes `routed_decode` more complicated and redundant. Taking your use case (decoding PNG) as an example, `routed_decode` would also add other decoding handlers such as `json`, `pickle`, etc. into this DataPipe.
Since TorchData provides a composable way to construct pipelines, users should be able to create a pipeline that handles their specific decoding mechanism.
Okay. What is the preferred mechanism to decode images? Ideally I think it should be done in batches if performance is needed
It depends on whether your `decode_fn` supports high-performance batched decoding (e.g. via multithreading). Otherwise, I think it's going to be similar to decoding per image.
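As a rough sketch of what a batched, multithreaded `decode_fn` might look like, the example below groups items into batches and decodes each batch on a thread pool from the standard library. `batched`, `decode_one` and `make_batched_decode_fn` are hypothetical names, and `decode_one` is a trivial placeholder for a real image decoder. Whether threads actually help depends on the decoder releasing the GIL, which is an assumption here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of up to batch_size consecutive items."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def decode_one(payload):
    # Placeholder for a real per-item decoder (e.g. an image decode call)
    return payload.decode("utf-8").upper()


def make_batched_decode_fn(executor):
    """Return a decode_fn that decodes a whole batch on the given thread pool."""
    def decode_batch(batch):
        # executor.map returns results in input order
        return list(executor.map(decode_one, batch))
    return decode_batch


payloads = [b"a", b"b", b"c", b"d", b"e"]

with ThreadPoolExecutor(max_workers=4) as executor:
    decode_batch = make_batched_decode_fn(executor)
    decoded_batches = [decode_batch(batch) for batch in batched(payloads, 2)]
```

If the per-item decoder is pure Python and holds the GIL, this is unlikely to be faster than mapping `decode_one` over items one at a time.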
We have a number of DataPipes that are being deprecated. Our general policy is that we first mark the DataPipe as deprecated with a warning, and wait at least one release cycle (~3 months) before removing it. Note that some DataPipes will be removed from the PyTorch Core library but will remain in TorchData, and some others are renamed.
Status Types:
DataLoader2 Tracker
- `PrototypeMultiProcessingReadingService` -> `MultiProcessingReadingService`

IterDataPipe Tracker
- `open_file_by_fsspec` is Removed
- `open_file_by_iopath` is Removed

MapDataPipe Tracker
- Nothing for now
cc: @ejguan @VitalyFedyunin @NivekT