fastai / fastai2

Temporary home for fastai v2 while it's being developed
https://dev.fast.ai
Apache License 2.0

Have the ability to add our own type_tfms on test_dl #522

Closed muellerzr closed 4 years ago

muellerzr commented 4 years ago

Is your feature request related to a problem? Please describe.

When you build a new test_dl and your data arrives in a format different from the one used in training, it can cause headaches. For example, my training data may be images in folders, but on my deployment platform the images are instead referenced from a DataFrame. Getting them into a DataLoader then involves overriding the original dataset's type transforms.

Describe the solution you'd like

The ability to pass in a dictionary of type transforms (or something along those lines that makes sense for overriding any of the available transforms), with the key being the index you want to override (such as the file, the label, etc.).
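To make the request concrete, here is a minimal, library-free sketch of the behaviour I have in mind. Nothing in it is fastai API: `override_type_tfms`, `apply_all`, `fake_open` and `parent_label` are hypothetical names made up for illustration, and plain lists of functions stand in for the real pipelines.

```python
from typing import Callable, Dict, List

Tfm = Callable[[object], object]

def apply_all(fns: List[Tfm], item: object) -> object:
    """Run `item` through each function in order (toy stand-in for a pipeline)."""
    for fn in fns:
        item = fn(item)
    return item

def override_type_tfms(base: List[List[Tfm]],
                       overrides: Dict[int, List[Tfm]]) -> List[List[Tfm]]:
    """Hypothetical helper: copy `base`, replacing the entries named in `overrides`."""
    bad = [i for i in overrides if not 0 <= i < len(base)]
    if bad:
        raise IndexError(f"no transform list at index {bad}")
    return [list(overrides.get(i, fns)) for i, fns in enumerate(base)]

def fake_open(path: str) -> str:
    """Toy replacement for an image loader."""
    return f"<image {path}>"

def parent_label(path: str) -> str:
    """Label an item by its parent folder name."""
    return path.split("/")[-2]

# Training: each item is a path string; index 0 builds the input, index 1 the label.
train_tfms = [[fake_open], [parent_label]]

# Deployment: each item is a row (a dict here) holding the path under 'fname',
# so only index 0 needs a different first step. The training list is left untouched.
test_tfms = override_type_tfms(
    train_tfms, {0: [lambda row: row["fname"], fake_open]}
)

row = {"fname": "pets/beagle/001.jpg"}
test_input = apply_all(test_tfms[0], row)
print(test_input)
```

The point of returning a new list rather than mutating the original is that the training and validation transforms stay as they were, which the workaround under "alternatives" does not guarantee.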

Describe alternatives you've considered

Overriding the actual transforms, as below:

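# Read the file name from the 'fname' column, then open it as an image
p = Pipeline([ColReader('fname'), PILImage.create])
# Swap the new pipeline in for the first set of type transforms
dls.valid_ds.tls[0].tfms = p
# Items are now DataFrame rows, so put pd.Series at the front of the types list
dls.valid_ds.tls[0].types.insert(0, pd.Series)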

(This is just a dummy example taken from the PETs notebook.)

The other issue that needs to be taken into account is that our data types may change. For example, in the code above, fastai wasn't expecting a DataFrame, because the original data wasn't one. So if we try to pass in a Series, we need to adjust the types as well.

muellerzr commented 4 years ago

This issue (and the workaround) is discussed in more detail here:

https://muellerzr.github.io/fastblog/2020/08/10/testdl.html

jph00 commented 4 years ago

Let's leave this until after release. Please remind me if I forget though!

muellerzr commented 4 years ago

Sounds good Jeremy!