Closed: HCA97 closed this issue 10 months ago
I have doubts it will improve our score but still.
I helped them at least. I will try to create Kaggle datasets out of the two from OverWhelmingFit. I think it will help!
I annotated and uploaded the data to Kaggle. I shared it with you. Again, I use bboxes that are the size of the image.
I am training a CLIP model with both the challenge and the inat data now.
Hi,
We got a worse result, aaaaaaaaaa
I will try to quickly test this one https://discourse.aicrowd.com/t/external-dataset-notice-on-usage-declaration/8999/4?u=hca97
> Hi,
> We got a worse result, aaaaaaaaaa
> I will try to quickly test this one https://discourse.aicrowd.com/t/external-dataset-notice-on-usage-declaration/8999/4?u=hca97
Yes, only 0.772 f1-score.
http://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit/issues/136
Yes, maybe using the cleaned version helps! I will also try to do another training tonight!
@fkemeth I am having some errors with the submission. It hangs like this:
```
./submit.sh yolo-v8-s-classic-vit-l-14-ema-lux-dataset-7
Making submission as "hca97"
Checking git remote settings...
Using gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit as the submission repository
Updated Git hooks.
Git LFS initialized.
On branch hca
nothing to commit, working tree clean
```
or
```
git push --set-upstream origin hca
Uploading LFS objects: 100% (21/21), 34 GB | 4.9 MB/s, done.
client_loop: send disconnect: Broken pipe
send-pack: unexpected disconnect while reading sideband packet
Enumerating objects: 24, done.
Counting objects: 100% (24/24), done.
Delta compression using up to 16 threads
Compressing objects: 100% (18/18), done.
fatal: the remote end hung up unexpectedly
nothing to commit, working tree clean
fatal: unable to access 'https://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit.git/': The requested URL returned error: 504
```
Hi @HCA97,
> fatal: unable to access 'https://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit.git/': The requested URL returned error: 504
I had the same error. I needed to create a new SSH key pair, then it worked again. The other two errors I did not observe. I thought it was due to my key pair being expired - but maybe there was some different issue/changes.
@fkemeth I have generated a new SSH key, but I am still having the same error. I can pull and commit my changes, but I cannot submit them.
If you are able to submit, could you submit my solution? It is in the hca branch.
Okay, I managed to resolve the issue. It seems like there is something wrong with submit.sh.
We can create a new tag either in the UI or via git, but the tag must start with `submission-`.
Did you manage to resolve it?
I never used the submit.sh. I always did something like
```
git tag -am "submission-abc" submission-abc
git push origin submission-abc
```
which triggered the submission (I committed and pushed the changes to master before that).
ooo, I always used the submission script. Yes, it is resolved.
In the inference script, we have
```python
image_cropped = image[bbox[1] : bbox[3], bbox[0] : bbox[2], :]
```
I am not sure about the YOLO format, but shouldn't it be
```python
image_cropped = image[bbox[0] : bbox[2], bbox[1] : bbox[3], :]
```
as with [xmin, ymin, xmax, ymax]?
Finally, we increased our score (I think the dataset from Lux is pretty good), but it takes more than 3 hours to train!
http://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit/issues/137
Unfortunately, I forgot to set the warm_up_steps, so we used 2000 instead of 1000, but I doubt switching to 1000 will improve our score. (training code https://github.com/HCA97/Mosquito-Classifiction/blob/yolo_and_more_testing/mosquito_clf_yolo_lux_ema.py)
```yaml
n_classes: 6
model_name: ViT-L-14
dataset: datacomp_xl_s13b_b90k
freeze_backbones: false
head_version: 7
warm_up_steps: 2000
bs: 16
data_aug: hca
loss_func: ce
epochs: 15
label_smoothing: 0.1
hd_lr: 0.0003
hd_wd: 1.0e-05
img_size:
  - 224
  - 224
use_ema: true
use_same_split_as_yolo: false
shift_box: false
max_steps: 60000
```
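The warm_up_steps setting above refers to a learning-rate warmup. As a hedged sketch of what a linear warmup-then-constant schedule typically looks like (the function name and the exact schedule in the repo are my assumptions, the real scheduler may decay after warmup):

```python
def warmup_lr(step: int, base_lr: float = 3e-4, warm_up_steps: int = 2000) -> float:
    """Linearly ramp the LR from ~0 to base_lr over warm_up_steps,
    then hold it constant. base_lr matches hd_lr from the config above."""
    if step < warm_up_steps:
        return base_lr * (step + 1) / warm_up_steps
    return base_lr

# Full base LR is reached exactly at the end of warmup:
print(warmup_lr(1999))  # 0.0003
```

With warm_up_steps: 1000 instead of 2000, the same base LR would simply be reached twice as early; everything after the warmup window is unchanged, which is consistent with the doubt that switching back would change the score much.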
Wow, nice: I am getting the same score with the local evaluation!
I think using the YOLO annotations is better than using the challenge annotations, because during inference we use YOLO annotations.
wow, 0.856 f1 score. What did you change?
I uploaded the YOLO annotations in the Kaggle dataset:
They are from our first YOLO model; ideally, I want to use the baseline model (the provided one) to reduce our complexity.
Btw, the bounding box locations are floats; don't forget to cast them to int.
> Finally, we increased our score (I think the dataset from Lux is pretty good), but it takes more than 3 hours to train!
> http://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit/issues/137
> Unfortunately, I forgot to set the warm_up_steps, so we used 2000 instead of 1000, but I doubt switching to 1000 will improve our score. (training code https://github.com/HCA97/Mosquito-Classifiction/blob/yolo_and_more_testing/mosquito_clf_mosq_lux_ema.py)
> ```yaml
> n_classes: 6
> model_name: ViT-L-14
> dataset: datacomp_xl_s13b_b90k
> freeze_backbones: false
> head_version: 7
> warm_up_steps: 2000
> bs: 16
> data_aug: hca
> loss_func: ce
> epochs: 15
> label_smoothing: 0.1
> hd_lr: 0.0003
> hd_wd: 1.0e-05
> img_size:
>   - 224
>   - 224
> use_ema: true
> use_same_split_as_yolo: false
> shift_box: false
> max_steps: 60000
> ```
Would you mind sharing the TensorBoard loss curves as well? Maybe we can see how we should change the hyperparameters from there.
> In the inference script, we have
> ```python
> image_cropped = image[bbox[1] : bbox[3], bbox[0] : bbox[2], :]
> ```
> I am not sure about the YOLO format, but shouldn't it be
> ```python
> image_cropped = image[bbox[0] : bbox[2], bbox[1] : bbox[3], :]
> ```
> as with [xmin, ymin, xmax, ymax]?
I think x and y are in Cartesian coordinates, so the rows of the image correspond to the y-axis and the columns to the x-axis.
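A minimal sketch of why the original indexing is correct (the helper name `crop_bbox` is mine, not from the inference script): NumPy images are indexed as [row, column] = [y, x], so with an [xmin, ymin, xmax, ymax] box, the y values come first in the slice. This also covers the float-to-int cast mentioned earlier.

```python
import numpy as np

def crop_bbox(image: np.ndarray, bbox) -> np.ndarray:
    """Crop an HxWxC image with an [xmin, ymin, xmax, ymax] box.
    NumPy indexes as [row, col] = [y, x], so y slices come first.
    Detector outputs are floats, so cast to int before slicing."""
    xmin, ymin, xmax, ymax = (int(v) for v in bbox)
    return image[ymin:ymax, xmin:xmax, :]

image = np.zeros((100, 200, 3), dtype=np.uint8)  # height 100, width 200
crop = crop_bbox(image, [10.0, 20.0, 60.0, 50.0])
print(crop.shape)  # (30, 50, 3): (ymax - ymin, xmax - xmin, channels)
```

Swapping the slice order, as in the suggested alternative, would instead cut a (xmax - xmin)-row by (ymax - ymin)-column patch from the wrong location.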
> Would you mind sharing the TensorBoard loss curves as well? Maybe we can see how we should change the hyperparameters from there.
Here: https://drive.google.com/drive/folders/1wd3FNpi8a3KRMVFykWp-U6Xr017apAl6?usp=sharing
I feel like our best solution is very similar to Lux's solution.
> I feel like our best solution is very similar to Lux's solution.
Yes, he has the same score. Do you know which data the others used in addition? Lux shared the cleaned iNat data, right? With the uncleaned version I did not get good results. Maybe the others cleaned their respective data as well.
Probably; I don't know how it could work with noisy data.
@fkemeth Bad news, I broke my Ubuntu :) I suspect it is thanks to the Nvidia drivers! They somehow got updated. And the bad thing is I cannot enter the BIOS or boot in recovery mode. I don't know why.
I had an experiment idea: train the model using different data augmentation, either HappyWhale or ImageNet, and then, if they perform decently (more than 0.83), do weight ensembling like https://github.com/HCA97/Mosquito-Classifiction/issues/4#issuecomment-1676440367
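The weight-ensembling idea can be sketched as a uniform average of parameter tensors from models with identical architecture ("model soup" style). This is a hedged sketch with NumPy arrays standing in for checkpoint tensors; `average_weights` and the toy state dicts are hypothetical names, and the linked experiment may weight the models non-uniformly.

```python
import numpy as np

def average_weights(state_dicts):
    """Uniformly average parameter arrays from same-architecture models.
    Assumes every state dict has identical keys and array shapes."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# Toy example with two fake "state dicts":
a = {"head.weight": np.array([1.0, 2.0])}
b = {"head.weight": np.array([3.0, 4.0])}
print(average_weights([a, b])["head.weight"])  # [2. 3.]
```

Unlike prediction-level ensembling, the averaged model costs a single forward pass at inference, which matters under the challenge's runtime limits.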
I trained with the lux data yesterday, but got only an f1-score of 0.8. Did you train with both folders in Lux? Did you do upsampling still? If so, only for the challenge data?
On Kaggle it takes a while to train the models, I fear I won't be able to train three different ones. Are you able to train on Kaggle still? Then we could split that up.
I was also wondering if it would make sense to use your model and fine-tune the last layer towards the f1 score. What do you think? Do you know where I can find it?
I hope you can get your Linux fixed! I am feeling with you, I also had issues with updates and Nvidia drivers a while ago, but for me bios still worked.
Hmm, what are your parameters?
> Finally, we increased our score (I think the dataset from Lux is pretty good), but it takes more than 3 hours to train!
> http://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit/issues/137
> Unfortunately, I forgot to set the warm_up_steps, so we used 2000 instead of 1000, but I doubt switching to 1000 will improve our score. (training code https://github.com/HCA97/Mosquito-Classifiction/blob/yolo_and_more_testing/mosquito_clf_yolo_lux_ema.py)
> ```yaml
> n_classes: 6
> model_name: ViT-L-14
> dataset: datacomp_xl_s13b_b90k
> freeze_backbones: false
> head_version: 7
> warm_up_steps: 2000
> bs: 16
> data_aug: hca
> loss_func: ce
> epochs: 15
> label_smoothing: 0.1
> hd_lr: 0.0003
> hd_wd: 1.0e-05
> img_size:
>   - 224
>   - 224
> use_ema: true
> use_same_split_as_yolo: false
> shift_box: false
> max_steps: 60000
> ```
Besides our default parameters (except warm_up_steps), I used Exponential Moving Average (EMA) as well.
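For reference, the EMA of model weights keeps a slow-moving shadow copy of each parameter that is updated after every optimizer step. A minimal sketch on a single scalar weight (the decay value here is a common default, my assumption, not necessarily what this training run used):

```python
def ema_update(ema_value: float, new_value: float, decay: float = 0.999) -> float:
    """One EMA step: the shadow weight moves a small fraction
    (1 - decay) toward the live training weight."""
    return decay * ema_value + (1.0 - decay) * new_value

# With decay=0.9 and a live weight stuck at 1.0, the shadow copy
# converges geometrically: after 10 steps it is at 1 - 0.9**10.
w_ema = 0.0
for _ in range(10):
    w_ema = ema_update(w_ema, 1.0, decay=0.9)
print(round(w_ema, 4))  # 0.6513
```

At evaluation time the EMA weights are used instead of the live ones, which often smooths out late-training noise.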
This is my validation data:
The training data is a merge of all of Lux's data (both folders) and the challenge training data:
> Did you do upsampling still? If so, only for the challenge data?
After I merge them I do upsampling, so for both datasets.
Note that both the validation and training bounding box annotations are from YOLO, not from the challenge.
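The upsampling step after merging can be sketched as duplicating minority-class samples until every class matches the largest class count. This is a hedged illustration; the function name and the random-duplication strategy are my assumptions, not the actual training code.

```python
import random
from collections import Counter

def upsample(samples, labels, seed=0):
    """Duplicate minority-class samples (with replacement) until every
    class has as many samples as the most frequent class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_s, out_l = list(samples), list(labels)
    for cls, n in counts.items():
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        for _ in range(target - n):
            i = rng.choice(idx)
            out_s.append(samples[i])
            out_l.append(labels[i])
    return out_s, out_l

s, l = upsample(["a", "b", "c"], ["albopictus", "albopictus", "culex"])
print(Counter(l))  # each class now appears twice
```

Doing this after the merge means the class balance is restored across the combined challenge + Lux data rather than per source.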
> I was also wondering if it would make sense to use your model and fine-tune the last layer towards the f1 score. What do you think? Do you know where I can find it?
Makes sense; maybe fine-tune the last layer with noisier bounding box annotations.
> On Kaggle it takes a while to train the models, I fear I won't be able to train three different ones. Are you able to train on Kaggle still? Then we could split that up.
I think there is no time now; I can start looking into it after 4-5 PM CET, and I don't think that will be enough time. Anyway, the current result is still good. As long as we didn't overfit to the public LB, we are good.
> I hope you can get your Linux fixed! I am feeling with you, I also had issues with updates and Nvidia drivers a while ago, but for me bios still worked.
Unfortunately, Linux always has problems with drivers. So annoying to deal with. But I guess there is nothing we can do.
Hi @HCA97,
I did one epoch of head fine-tuning with the ce+f1 loss. The resulting model did not change much though:
`epoch=0-val_loss=1.2470133304595947-val_f1_score=0.8503904938697815-val_multiclass_accuracy=0.8767501711845398.ckpt`
I will try to train another model tonight with a larger learning rate and less data augmentation.
This is the notebook https://www.kaggle.com/code/fkemeth/pho-experimentation/edit/run/147309771
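Since F1 is not differentiable as a hard count, the "f1" part of a ce+f1 loss is usually a soft-F1 term built from class probabilities. A hedged sketch with NumPy (the notebook's exact formulation, weighting, and averaging may differ):

```python
import numpy as np

def soft_f1_loss(probs: np.ndarray, onehot: np.ndarray, eps: float = 1e-8) -> float:
    """Macro soft-F1 loss: use class probabilities as soft TP/FP/FN
    counts so the objective stays differentiable. probs and onehot
    are (n_samples, n_classes) arrays; returns 1 - mean soft F1."""
    tp = (probs * onehot).sum(axis=0)
    fp = (probs * (1 - onehot)).sum(axis=0)
    fn = ((1 - probs) * onehot).sum(axis=0)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return float(1 - f1.mean())

# Perfect predictions drive the loss to ~0:
y = np.eye(3)  # one-hot labels for 3 samples over 3 classes
print(round(soft_f1_loss(y, y), 6))  # 0.0
```

In a combined objective this term would simply be added to the cross-entropy, trading a little calibration for alignment with the challenge's F1 metric.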
Good news, I think I fixed my PC. It was a kernel issue (I needed to upgrade the kernel) that affected the Nvidia driver. And apparently, my keyboard is not detected during bootup; I used an old keyboard of mine and managed to boot into recovery mode.
I will run experiments with different data augmentation tonight and try to merge their results tomorrow.
Good to hear it is working again!
I trained a few more models; one got a slightly better score on the val data:
`val_f1_score=0.8680818676948547`
I will submit it now to see if it holds on the challenge data as well.
:crossed_fingers:
Good morning! My experiments failed because after a while I got a black screen and all training stopped. I will look into those issues now.
> Good to hear it is working again!
> I trained a few more models; one got a slightly better score on the val data:
> `val_f1_score=0.8680818676948547`
> I will submit it now to see if it holds on the challenge data as well.
Could you use the https://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit/-/blob/921be0e22fc63178c531f6d73067d926e5ff5b69/my_models/yolo_model_weights/best-yolov8-s-classic.pt YOLO model when you submit your solution?
Yes, I updated my code! I should have merged your PR on gitlab!
How is it going with your experiments?
Bad, I still have problems with my system. I updated the Nvidia driver from 525 to 530, but the issue still persists. I think I need to reinstall Ubuntu, so I stopped experimenting :)
Good luck with it!
I also could not get a better model.
Anyway, I think we did a good job.
Yes, it was actually more fun than I expected. Thanks for reaching out!
I hope you learned as much as I did. Let me know if at any time you want to team up again, maybe we can build on the code/knowledge we got in this challenge.
Considering the time we could spend on it, fifth place is great in my opinion.
Hi @fkemeth,
> Yes, it was actually more fun than I expected. Thanks for reaching out!
Thank you for partnering with me as well.
> I hope you learned as much as I did. Let me know if at any time you want to team up again, maybe we can build on the code/knowledge we got in this challenge.
I also learned new things, and I believe some of them will be really useful in the future:
> Considering the time we could spend on it, fifth place is great in my opinion.
I agree. We didn't have powerful hardware or a lot of free time. If I hadn't had problems with my system, I think we could have scored a few points higher.
I will close this issue. Using Lux's dataset improved our score significantly.
https://discourse.aicrowd.com/t/external-datasets-used-by-participants/9217