johko / computer-vision-course

This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
MIT License
432 stars, 136 forks

LoRA-Image-Classification notebook: dataset type is different when downloading #322

Status: Closed (the-ray-kar closed 1 week ago)

the-ray-kar commented 2 months ago

In the LoRA-Image-Classification notebook from the course, when running on Google Colab and loading the dataset with `dataset = load_dataset('pcuenq/oxford-pets')`, each sample has this structure:

    {'path': '/data/datasets/magic-ml/oxford-iiit-pet/images/Siamese_137.jpg', 'label': 'Siamese', 'dog': False, 'image': {'bytes': b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xdb\x00C\x00\x08\x06\x06\x07\x06\x05\x08\x07\x07\x07\t\t\x08\n\x0c\x14\r\x0c\x0b\x0b\x0c\x19\x12\x13\x0f\x14\x1d\x1a\x1f\x1e\x1d\x1a\x1c\x1c $.\' ",#\x1c\x1c(7),0

I think the images are not in PIL format (the `image` field is a dict of raw bytes), while the notebook code expects PIL images in several places. For example, here:

    def show_samples(ds,rows=2,cols=4):
        samples = ds.shuffle().select(np.arange(rows*cols))
        fig = plt.figure(figsize=(cols*4,rows*4))
        for i in range(rows*cols):
            img = samples[i]['image']

and here:

    from PIL import Image
    def transforms(batch):
        batch['image'] = [x.convert('RGB') for x in batch['image']]
        inputs = processor([x for x in batch['image']],return_tensors='pt')
        inputs['labels']=[label2id[y] for y in batch['label']]
        return inputs
    dataset = dataset.with_transform(transforms)

I solved it by decoding the bytes inside the transform:

    from io import BytesIO
    from PIL import Image

    def transforms(batch):
        # Convert bytes to PIL images
        batch['image'] = [Image.open(BytesIO(x['bytes'])).convert('RGB') for x in batch['image']]
        inputs = processor([x for x in batch['image']], return_tensors='pt')
        inputs['labels'] = [label2id[y] for y in batch['label']]
        return inputs

I don't know whether other people get a similar error in Google Colab.
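Since it is unclear whether every environment returns raw bytes, a transform that accepts both forms may be more robust than one that assumes either. The following is only a sketch under stated assumptions: `to_rgb_image` and `make_transforms` are hypothetical helper names that are not part of the notebook, while `processor` and `label2id` are assumed to be the objects the notebook already defines.

```python
from io import BytesIO


def to_rgb_image(x):
    """Return an RGB image from a decoded PIL image or a raw {'bytes': ...} dict.

    Hypothetical helper, not part of the course notebook.
    """
    if isinstance(x, dict) and x.get("bytes") is not None:
        # Raw, undecoded sample as reported in this issue: decode it first.
        from PIL import Image  # lazy import, only needed on this path

        x = Image.open(BytesIO(x["bytes"]))
    # Already-decoded PIL images (what the notebook expects) pass straight through.
    return x.convert("RGB")


def make_transforms(processor, label2id):
    """Build a batch transform suitable for dataset.with_transform.

    `processor` and `label2id` are assumed to be the notebook's objects.
    """

    def transforms(batch):
        images = [to_rgb_image(x) for x in batch["image"]]
        inputs = processor(images, return_tensors="pt")
        inputs["labels"] = [label2id[y] for y in batch["label"]]
        return inputs

    return transforms
```

With these in place, `dataset = dataset.with_transform(make_transforms(processor, label2id))` would replace the notebook's original call, and `to_rgb_image(samples[i]['image'])` could be used the same way inside `show_samples`.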

kawchar85 commented 3 weeks ago

I've encountered a similar issue and resolved it using a similar approach. You can check more details here: https://github.com/johko/computer-vision-course/issues/312#issuecomment-2294832966

the-ray-kar commented 1 week ago

> I've encountered a similar issue and resolved it using a similar approach. You can check more details here: #312 (comment)

Yes. That's it :)