facebookresearch / dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.
Apache License 2.0
9.13k stars, 812 forks

Simple questions about dataset preparation before training DINOV2 #107

Open EddieAy opened 1 year ago

EddieAy commented 1 year ago

@woctezuma Sorry to bother you again, but I just want to know how I can get a correct ImageNet-like dataset.

- So do I need a /labels.txt, like below? Also, I don't have a test set. image

- I don't know what the files under 'extra' are, but when I follow the code below: image

Thank you @woctezuma

XiaohuJoshua commented 1 year ago

+1, it would be better if we could just use an ImageFolder.
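
One way to bridge the gap from an ImageFolder-style tree is to generate the missing `labels.txt` from the class folders. Below is a minimal sketch, assuming (not confirmed in this thread) that the loader reads one `class_id,class_name` row per class. The function name `write_labels_txt` is hypothetical, and the folder name is simply reused as the human-readable class name.

```python
import csv
from pathlib import Path


def write_labels_txt(root):
    """Write <root>/labels.txt from the class folders found under <root>/train.

    Assumption: one "class_id,class_name" row per class. The folder name is
    used for both fields because an ImageFolder tree carries no other name.
    """
    root = Path(root)
    # Each sub-directory of train/ is treated as one class; stray files are skipped.
    class_ids = sorted(p.name for p in (root / "train").iterdir() if p.is_dir())
    with open(root / "labels.txt", "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        for class_id in class_ids:
            writer.writerow([class_id, class_id])
    return class_ids


# Usage (hypothetical path):
# write_labels_txt("/path/to/my_dataset")
```

This only covers `labels.txt`; the files under 'extra' mentioned in the question would still have to be generated separately.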

onvungocminh commented 1 year ago

@EddieAy @XiaohuJoshua Did you solve the problem?

HDL-YD commented 1 year ago

I have the same question.

bruce-willis commented 8 months ago

There are several ways to train a linear head for classification on your own data:

  1. Use the Downstream-Dino-V2 repository.
  2. Use the HF Transformers version of DINOv2 with its example notebook for classification.
  3. Modify this repository a little bit.

I will advocate for the last approach because it trains many linear heads simultaneously with different parameters (learning rate, average pooling or not, and different blocks from the backbone). I will do it for the imagenette dataset, but you can, of course, use whatever dataset you want.
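
To make "many linear heads with different parameters" concrete, here is a minimal sketch of how such a grid could be enumerated. It is only an illustration: the function name `head_grid`, the naming scheme, and the example values are assumptions and not the repository's actual configuration.

```python
from itertools import product


def head_grid(learning_rates, n_last_blocks_list, avgpool_options=(False, True)):
    """Enumerate one linear-head configuration per parameter combination."""
    grid = []
    for n_blocks, avgpool, lr in product(n_last_blocks_list, avgpool_options, learning_rates):
        grid.append(
            {
                # Hypothetical naming scheme, used only to tell the heads apart.
                "name": f"head_blocks{n_blocks}_avgpool{avgpool}_lr{lr:g}",
                "n_last_blocks": n_blocks,  # how many final backbone blocks feed the head
                "avgpool": avgpool,         # whether average-pooled patch tokens are used
                "lr": lr,                   # learning rate for this head
            }
        )
    return grid


# Illustrative values only.
grid = head_grid(learning_rates=[1e-4, 1e-3, 1e-2], n_last_blocks_list=[1, 4])
# 2 block settings x 2 pooling settings x 3 learning rates = 12 heads
print(len(grid))
```

Since every head reads the same frozen backbone features, all of them can be trained in a single pass over the data and the best one picked on the validation split.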

Feel free to contact me if you have any issues.

PMRS-lab commented 2 months ago

@woctezuma Sorry to bother you again, but I just want to know how I can get a correct ImageNet-like dataset.

* For example, when I train DINOv1, it's simple: it just uses an ImageFolder.

* The picture below shows the dataset preparation process in **DINOv1**:
  ![image](https://private-user-images.githubusercontent.com/101048881/240350680-cca88cae-dcb9-498a-94c7-dfe58a4f4d91.png)

* And this is my dataset structure. I don't need to adjust this dataset in any way to train **DINOv1**, because this is the layout ImageFolder expects.
  ![image](https://private-user-images.githubusercontent.com/101048881/240351056-aa8cef2e-f15e-47c3-a914-c94ed77a4536.png)

* BUT NOW, when it comes to DINOv2, it shows this:
  ![image](https://private-user-images.githubusercontent.com/101048881/240351623-54d80d18-715f-47f0-b3e1-c408157b31f7.png)

- So do I need a /labels.txt, like below? Also, I don't have a test set. image

- I don't know what the files under 'extra' are, but when I follow the code below: image

* I got this. So how can I fix it? How do I get started training DINOv2?
  ![image](https://private-user-images.githubusercontent.com/101048881/240352539-956f6fcf-60c0-41e7-bdc2-a21cca3fe018.png)

* And this is my dataset structure

image

Thank you @woctezuma

Did you solve the problem?