Unit 3 - Vision Transformer: Transfer learning notebook dataset loads without decoding the images.

johko / computer-vision-course

This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord

MIT License


Unit 3 - Vision Transformer: Transfer learning notebook dataset loads without decoding the images. #311

Open benjamin-printify opened 3 weeks ago

benjamin-printify commented 3 weeks ago

The notebook expects a PIL image under each data point's 'image' key, but it actually loads a dict with 'path' and 'bytes' (the dataset is not decoded). This makes the notebook fail, starting from the cell in which some images are displayed.

I'm not sure this is the right solution, but explicitly decoding the images when loading the dataset solves it:

from datasets import load_dataset, Image

dataset = load_dataset('pcuenq/oxford-pets').cast_column('image', Image(decode=True))

(Note that Image here is imported from datasets, not from PIL.)
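For anyone who wants to confirm which form the 'image' field takes before changing the loading code, here is a minimal sketch. The helper names `is_undecoded` and `ensure_pil` are hypothetical (they are not part of the notebook or of datasets), and the decoding path assumes Pillow is installed. It only illustrates the symptom described above and is not necessarily how the notebook should be fixed.

```python
import io


def is_undecoded(image_field):
    """Return True if the field looks like an undecoded image record,
    i.e. a dict with 'path' and 'bytes' keys rather than a PIL image."""
    return isinstance(image_field, dict) and {"path", "bytes"}.issubset(image_field)


def ensure_pil(image_field):
    """Return the field unchanged unless it is an undecoded record,
    in which case decode it into a PIL image."""
    if not is_undecoded(image_field):
        return image_field  # assumed to already be a PIL image
    # Imported lazily so the check above works without Pillow installed.
    from PIL import Image as PILImage
    if image_field["bytes"] is not None:
        return PILImage.open(io.BytesIO(image_field["bytes"]))
    # Fall back to the file path when no raw bytes are stored.
    return PILImage.open(image_field["path"])
```

With the dataset loaded as in the notebook, something like `is_undecoded(dataset['train'][0]['image'])` should show whether the column was decoded (the 'train' split name is an assumption here).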