Closed longmalongma closed 3 years ago
is the file actually there?
@hkchengrex Yes, I am sure the files all exist. Extraction took a long time and every archive was extracted, so I don't know why it reports a missing file.
@hkchengrex Also, the directory really does not contain a kea03423 folder, even though I downloaded all 6 archives and extracted all of them. Could you help check which archive the kea03423 folder is in?
The path should be like /data/BL30K/Annotations/kea03423. The error is like /data/BL30K/a/BL30K/Annotations/kea03423. You are not pointing to the right path.
@hkchengrex I don't think it is a path problem, because the error persists after I changed the directory path. Is it possible that the archives you provided are missing the kea03423 folder in Annotations? I extracted all 6 archives and still could not find it.
How many folders do you have in JPEGImages, and how many folders do you have in Annotations?
The number of folders in JPEGImages is 19350, and the number of folders in Annotations is 19347. Three folders are missing.
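Beyond the counts, it can help to list exactly which video folders differ. The following is only a sketch: it assumes the layout discussed in this thread (`JPEGImages/<video>` and `Annotations/<video>` under one root), and `compare_video_folders` is a hypothetical helper name, not part of MiVOS.

```python
import os


def folder_names(path):
    """Return the set of immediate sub-directory names under path."""
    return {entry.name for entry in os.scandir(path) if entry.is_dir()}


def compare_video_folders(root):
    """Compare video folders under root/JPEGImages and root/Annotations.

    Returns two sorted lists: videos that have images but no annotations,
    and videos that have annotations but no images.
    """
    images = folder_names(os.path.join(root, "JPEGImages"))
    annotations = folder_names(os.path.join(root, "Annotations"))
    return sorted(images - annotations), sorted(annotations - images)


# Example call (the root path is a placeholder for your own BL30K location):
# missing_ann, missing_img = compare_video_folders("/data/BL30K")
# print(len(missing_ann), "videos lack annotations:", missing_ann[:10])
```

This only compares folder names; it does not check whether the frames inside each folder are complete.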
@hkchengrex The kea03423 folder exists in JPEGImages but is missing from Annotations.
I can confirm that both JPEGImages/kea03423 and Annotations/kea03423 exist in BL30K_d.tar.
The corresponding MD5 checksum is e659ed7c4e51f4c06326855f4aba8109.
You might want to verify your download, extract again, and see what went wrong. When fully extracted, BL30K_d.tar should give you 5000 videos.
The checksums for all the tar files have been updated in the main MiVOS repo.
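To check a local copy of an archive for a given folder without extracting it, something like the sketch below could be used. It relies only on Python's standard `tarfile` module; `tar_contains_folder` is a hypothetical helper, and because the path prefix inside the BL30K archives is not stated in this thread, the folder is matched anywhere in the member path.

```python
import tarfile


def tar_contains_folder(tar_path, folder):
    """Return True if any archive member lies at or under `folder`.

    `folder` (e.g. "Annotations/kea03423") is matched as a run of
    consecutive path components anywhere in the member name, since the
    leading prefix inside the archive is an unknown here.
    """
    needle = folder.strip("/").split("/")
    n = len(needle)
    with tarfile.open(tar_path) as tar:
        for member in tar:  # reads member headers only, nothing is extracted
            parts = member.name.strip("/").split("/")
            for i in range(len(parts) - n + 1):
                if parts[i:i + n] == needle:
                    return True
    return False


# Example call (file name taken from this thread, path is a placeholder):
# print(tar_contains_folder("BL30K_d.tar", "Annotations/kea03423"))
```

Scanning a very large tar file this way still reads through the whole archive, so it can take a while.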
@hkchengrex What is the corresponding MD5 checksum? What is it for? If the file you provided does not contain the missing folders, there seems to be no point in downloading and decompressing it again.
My current workaround is to delete the three folders that JPEGImages has but Annotations lacks, but I don't know whether doing so will affect model performance.
MD5 is for error checking: https://en.wikipedia.org/wiki/MD5. I just said that those folders exist in the original tar file. So there must be something wrong in between.
OK, thank you for your help. I must have made a mistake somewhere. I will re-download BL30K_d.tar and try extracting it again.
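@hkchengrex After checking, the three folders missing from my Annotations are kea03423, kea05234, and keb03218. Could you package these three folders and send them to me? Downloading the whole dataset again would take too long; downloading BL30K has already taken a very long time. Is it correct that Annotations and JPEGImages should each contain 19350 folders? My email: 1442342449@qq.com. Thanks.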
Do the checksums match? You might be missing a lot more if you are missing three folders. If you really wanted to try you can just delete JPEGImages/[those folders] as we scan JPEGImages for candidates.
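@hkchengrex They match. Apart from the three folders missing from Annotations, the remaining 19347 folders match: after I deleted those three folders from JPEGImages, Annotations and JPEGImages each contain 19347 folders, training ran without errors, and one epoch completed successfully. So could you package the three missing folders and send them to me?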
If the checksums match you should just extract them again.
- Running an epoch successfully does not mean that you have all the files. In one epoch each video is sampled once (with three frames) only.
- You should have a lot more than 19347 if you downloaded all the packs. It is BL30K, so having just 19K videos would be a scam.
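Since one clean epoch touches only a few frames per video, a direct check of every frame is more telling. Here is a minimal sketch, assuming each image frame `JPEGImages/<video>/<frame>.jpg` should have a mask `Annotations/<video>/<frame>.png`. The `.png` name comes from the error message in this thread; the `.jpg` extension and the helper name `videos_with_missing_frames` are assumptions.

```python
import os


def videos_with_missing_frames(root, image_ext=".jpg", mask_ext=".png"):
    """Map each video to the frame stems that have an image but no mask.

    Videos whose Annotations folder is absent are reported with all of
    their frames listed as missing.
    """
    image_root = os.path.join(root, "JPEGImages")
    mask_root = os.path.join(root, "Annotations")
    problems = {}
    for video in sorted(os.listdir(image_root)):
        image_dir = os.path.join(image_root, video)
        if not os.path.isdir(image_dir):
            continue
        frames = {os.path.splitext(name)[0]
                  for name in os.listdir(image_dir)
                  if name.endswith(image_ext)}
        mask_dir = os.path.join(mask_root, video)
        masks = set()
        if os.path.isdir(mask_dir):
            masks = {os.path.splitext(name)[0]
                     for name in os.listdir(mask_dir)
                     if name.endswith(mask_ext)}
        missing = sorted(frames - masks)
        if missing:
            problems[video] = missing
    return problems


# Example call (the root path is a placeholder):
# for video, frames in videos_with_missing_frames("/data/BL30K").items():
#     print(video, len(frames), "frames without a mask")
```

An empty result only says that images and masks are consistent with each other. It cannot detect videos that are missing from both folders, which is what the checksums and the total video count are for.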
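@hkchengrex I believe I downloaded all 6 archives. Could you check whether the archive sizes look right?
One more thing confuses me. Last night I extracted all 6 archives again, but this time I extracted them at the same time rather than one after another. After extracting during the day yesterday I got 19350 folders, but after extracting at night I got 14347. The two extractions gave different counts. Could it be that the archives cannot be extracted simultaneously and must be extracted one at a time?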
- Correct checksums tell you everything. Don't rely on file sizes.
- It shouldn't matter.
I will use MD5 for Correct checksums. MD5 is for error checking: https://en.wikipedia.org/wiki/MD5.
@hkchengrex I've re-downloaded the BL30K data set and extracted it twice, and the number of JPEGImage files is 24159, and the number of Annotations is 24157, is that correct?
No. Are the checksums correct?
I'm sorry, I don't know how to compute checksums.
md5sum [file] in the terminal
md5sum BL30K? or md5sum BL30K_a.tar?
On the tar files. I have provided the server-side checksums two weeks ago, you can check with that.
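Where `md5sum` is not available, the same check can be sketched with Python's standard `hashlib`. The helper names `md5_of_file` and `verify` are hypothetical. The file is read in chunks so that a large tar file does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's MD5 equals the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()


# Example call, using the checksum quoted earlier in this thread for
# BL30K_d.tar (the local path is a placeholder):
# print(verify("BL30K_d.tar", "e659ed7c4e51f4c06326855f4aba8109"))
```

If `verify` returns False, the download is corrupted or incomplete and extracting it again will not help. If it returns True, the archive is intact and the problem lies in extraction or in the path.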
If I use 24157 files for pre-training, will network performance not suffer too much?
I don't know how much stuff you are actually missing. If you have 24K full videos it should be fine.
The number of JPEGImages folders in the BL30K dataset I downloaded is 24159, and the number of Annotations folders is 24157. This is less than 30K videos, so I don't know whether it will cause the accuracy to decrease.
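Hello author, I downloaded all 6 BL30K archives and extracted all of them, but during the second stage of pre-training I get an error saying that data/dangjisheng/BL30K/a/BL30K/Annotations/kea03423/00020.png cannot be found. I don't know why. I downloaded all 6 archives and extracted them all into one directory, so why does it report a missing file? Looking forward to your reply.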