Closed spacycoder closed 3 months ago
Thank you, I updated the files. Please check.
Thanks! But I still think there are some issues. Now the object365_train_panseg.tar contains ~166 478 files but I can only open ~63 000 of them, the rest are just symlinks. Should I just ignore the rest?
Could you please provide a few image IDs for me to check? I should copy all the panseg data into one folder.
Here are a couple of files:
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v2_01892927.png'): can't open/read file: check file path/integrity
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v2_01693633.png'): can't open/read file: check file path/integrity
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v2_01872366.png'): can't open/read file: check file path/integrity
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v1_00331480.png'): can't open/read file: check file path/integrity
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v2_01685998.png'): can't open/read file: check file path/integrity
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v1_00334198.png'): can't open/read file: check file path/integrity
[ WARN:0@158.066] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v1_00360979.png'): can't open/read file: check file path/integrity
[ WARN:0@158.072] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v2_01865393.png'): can't open/read file: check file path/integrity
[ WARN:0@158.072] global loadsave.cpp:241 findDecoder imread_('object365_train_panseg/objects365_v2_01863335.png'): can't open/read file: check file path/integrity
This shows one of the missing files has type "symbolic link":
stat object365_train_panseg/objects365_v2_01892927.png
File: object365_train_panseg/objects365_v2_01892927.png -> /mnt/bn/bytenas-lq-dxq/zipfile_panseg_copy/6/objects365_v2_01892927.png
Size: 71 Blocks: 8 IO Block: 4096 symbolic link
Device: fc01h/64513d Inode: 113785742 Links: 1
This shows one of the files that works and has type "regular file"
stat object365_train_panseg/objects365_v1_00581444.png
File: object365_train_panseg/objects365_v1_00581444.png
Size: 3371 Blocks: 8 IO Block: 4096 regular file
Device: fc01h/64513d Inode: 113553820 Links: 1
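To check the whole folder rather than running stat on files one by one, something like the following could count how many entries are regular files and how many are symlinks. This is only a rough sketch: the function name `classify_entries` and the category labels are made up for illustration, and it only looks at the top level of the folder.

```python
import os
from collections import Counter


def classify_entries(folder):
    """Count regular files, empty files, and symlinks in one folder (non-recursive)."""
    counts = Counter()
    for entry in os.scandir(folder):
        if entry.is_symlink():
            # os.path.exists follows the link, so it is False when the target is missing.
            kind = "symlink_ok" if os.path.exists(entry.path) else "symlink_dangling"
        elif entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            kind = "empty_file" if size == 0 else "regular_file"
        else:
            kind = "other"
        counts[kind] += 1
    return counts
```

Calling `classify_entries("object365_train_panseg")` on the extracted folder would then give a per-type count that can be compared against the expected number of images.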
Also I think relabeled-coco and coconut-val on huggingface are the same. They both contain ~5000 files
Sorry, I uploaded the wrong folder because in my dataset they all shared the same name. It is now fixed for coconut_val. For large, it is strange: we annotated an extra 6k images and merged them together. You can ignore these 6k for now, but I will fix it soon. Thanks for the issue.
The coconut_val dataset contains symlinks. Maybe you should add the "--dereference" option when you tarball the folder? reference
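If the archive is built from Python instead of the tar command line, the `tarfile` module has a `dereference` argument that plays the same role. A minimal sketch, assuming the data sits in one folder; the function name `make_tar_dereferenced` and the paths passed to it are placeholders, not the actual upload script:

```python
import os
import tarfile


def make_tar_dereferenced(src_dir, tar_path):
    """Pack src_dir into tar_path, storing the files that symlinks point to."""
    # dereference=True makes tarfile follow symlinks and add the target's
    # content, instead of recording the link itself.
    with tarfile.open(tar_path, "w", dereference=True) as tar:
        # arcname keeps only the folder name, not the full local path.
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))
```

Since the link targets have to be readable when the archive is created, this needs to run on the machine where the original files live.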
Thanks for the issue, updated.
Great, thanks!
In the Hugging Face coconut_l there are a lot of symlinks, and in the Kaggle coconut_l the majority of the files are 0 bytes.
There seem to be some missing files in the COCONut Large dataset. Hugging Face shows there are ~31 000 files, but according to your table there should be roughly 116 000, I think? There also seem to be some images in the .tar file that are symlinks into your file system, e.g.
...object365_train_panseg_copy/objects365_v2_01860597.png ->../bytenas-lq-dxq/zipfile_panseg_copy/4/objects365_v2_01860597.png
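The symlink entries can also be listed straight from the archive without extracting it. A rough sketch using Python's `tarfile` module; the function name `summarize_tar` is made up for illustration:

```python
import tarfile


def summarize_tar(tar_path):
    """Return (regular_count, symlinks); symlinks is a list of (name, target) pairs."""
    regular = 0
    symlinks = []
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if member.issym():
                # linkname is the path the symlink points to, as stored in the archive.
                symlinks.append((member.name, member.linkname))
            elif member.isreg():
                regular += 1
    return regular, symlinks
```

Running this on the downloaded .tar gives the number of usable images and the list of entries that only point at the uploader's file system.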