lsongx / DomainAdaptiveReID

MIT License
188 stars 50 forks source link

Question about Dataloader #12

Open zengkaiwei opened 5 years ago

zengkaiwei commented 5 years ago

In your selftrain.py, len(dataloader) is about 20 (len(dataloader) and len(images) differ a little on each iteration) and training is very fast. It seems that you choose only a part of the images to train on. But I expected len(dataloader) = len(images)/batch_size. For example, in duke2market, selftrain.py iteration 2 has 10105 training images, but len(dataloader) is 18 (batch size is 128); iteration 28 has 12285 images, but len(dataloader) is 14. So len(dataloader) and len(images) are not strictly linearly related. I suspect len(dataloader) depends on both len(images) and num_ids, but I don't know why.

lsongx commented 5 years ago

Hi @zengkaiwei , we use PK sampling for triplet loss. Please see https://github.com/LcDog/DomainAdaptiveReID/blob/d078d4cc3de951f4e680e613e504664a9befb7a4/reid/utils/data/sampler.py#L11 for more details.

zengkaiwei commented 5 years ago

I had already read that code. The problem is that searching online for "PK sampling" doesn't turn up a clear explanation, and while I have my own reading of the code, I'm not confident it's right. My understanding is: suppose 5000 images are selected in total, covering 2000 distinct ids. For an id containing more than 4 (num_instances) images, 4 distinct images are sampled without replacement and added to ret; for an id with fewer than 4 images, 4 images are sampled with replacement and added to ret. The end result is that the length of ret depends both on the length of the new dataset and on the number of ids. Also, the more ids with fewer than 4 images there are, the relatively larger the sampled dataset becomes (with many duplicated images?). I checked the length of ret, and it actually doesn't change much between iterations. Is this behavior of PK sampling intentional on your part, to make training more stable, or is there some other reason?
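For reference, the sampling behavior described above can be sketched roughly as follows. This is a hypothetical simplification written to illustrate the idea, not the repo's actual `RandomIdentitySampler`; the function name `pk_sample` and the `(image, pid)` dataset layout are assumptions for the example:

```python
import random
from collections import defaultdict

def pk_sample(dataset, num_instances=4):
    """Toy sketch of PK sampling.

    dataset: list of (image, pid) pairs. Every identity contributes
    exactly num_instances indices: drawn without replacement if the
    identity has enough images, with replacement otherwise.
    """
    # Group image indices by person id.
    index_by_pid = defaultdict(list)
    for idx, (_, pid) in enumerate(dataset):
        index_by_pid[pid].append(idx)

    ret = []
    pids = list(index_by_pid)
    random.shuffle(pids)
    for pid in pids:
        indices = index_by_pid[pid]
        if len(indices) >= num_instances:
            # Enough images: sample without replacement.
            ret.extend(random.sample(indices, num_instances))
        else:
            # Too few images: sample with replacement (duplicates).
            ret.extend(random.choices(indices, k=num_instances))
    return ret
```

Under this reading, the epoch length is `num_ids * num_instances` regardless of `len(dataset)`, so `len(dataloader) ≈ num_ids * num_instances / batch_size`, which would explain why it tracks the number of clustered identities rather than the number of images.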


lsongx commented 5 years ago

@zengkaiwei We follow the method introduced in https://arxiv.org/pdf/1703.07737.pdf. Please refer to Section 3.5, Batch Generation and Augmentation.

zengkaiwei commented 5 years ago

Could you give me a link to msmt17_V1.tar.gz? The official website has been shut down.

lsongx commented 5 years ago

@zengkaiwei You can email the original authors to request it.