simple question about data loader

lxtGH commented 6 years ago

Nice implementation! But the data loader doesn't follow the usual PyTorch pattern, so what information does `im_info` contain?

ahmed-shariff commented 6 years ago

From what I gather, `im_info` is an N x 3 tensor, where N is the batch size. The three values are the image height, the image width, and a third value I'm not sure about. I was able to train the network while completely ignoring the third value.

edit: typo

lxtGH commented 6 years ago

Yes, me too. I'd like to know why it has the third value.

lxtGH commented 6 years ago

@ahmed-shariff I found that the third value is used in the test net, but I still don't understand why it is needed.
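One plausible reading, sketched below, is that the third value is the resize scale applied to the image, so that test-time code can map predicted boxes back to original image coordinates. This is an assumption and has not been confirmed in this thread; the helpers `make_im_info` and `boxes_to_original` are hypothetical names used only for illustration, and plain lists stand in for tensors.

```python
# Sketch only: ASSUMES each im_info row is (height, width, scale), where
# scale is the factor used to resize the original image. Not confirmed here.

def make_im_info(orig_h, orig_w, scale):
    """Build one hypothetical im_info row for an image resized by `scale`."""
    return [round(orig_h * scale), round(orig_w * scale), scale]


def boxes_to_original(boxes, im_info_row):
    """Map (x1, y1, x2, y2) boxes from resized to original image coordinates."""
    scale = im_info_row[2]
    return [[coord / scale for coord in box] for box in boxes]


# Batch of N = 1 image, so im_info has shape N x 3.
im_info = [make_im_info(300, 400, 2.0)]

# Boxes predicted in resized-image pixels.
pred_boxes = [[200.0, 100.0, 600.0, 500.0]]

print(im_info[0])
print(boxes_to_original(pred_boxes, im_info[0]))
```

Under that assumption, training can ignore the value (the loss is computed in resized coordinates), while testing needs it to report boxes at the original resolution.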

lxtGH commented 6 years ago

@jwyang Hi, why do you perform the following operation to keep the same dimensions within each batch?

```python
for i in range(num_batch):
    left_idx = i*batch_size
    right_idx = min((i+1)*batch_size-1, self.data_size-1)

    if ratio_list[right_idx] < 1:
        # for ratio < 1, we preserve the leftmost in each batch.
        target_ratio = ratio_list[left_idx]
    elif ratio_list[left_idx] > 1:
        # for ratio > 1, we preserve the rightmost in each batch.
        target_ratio = ratio_list[right_idx]
    else:
        # for ratio cross 1, we make it to be 1.
        target_ratio = 1
```

RLWH commented 5 years ago

> @jwyang Hi, why do you perform the following operation to keep the same dimensions within each batch?
>
> ```python
> for i in range(num_batch):
>     left_idx = i*batch_size
>     right_idx = min((i+1)*batch_size-1, self.data_size-1)
>
>     if ratio_list[right_idx] < 1:
>         # for ratio < 1, we preserve the leftmost in each batch.
>         target_ratio = ratio_list[left_idx]
>     elif ratio_list[left_idx] > 1:
>         # for ratio > 1, we preserve the rightmost in each batch.
>         target_ratio = ratio_list[right_idx]
>     else:
>         # for ratio cross 1, we make it to be 1.
>         target_ratio = 1
> ```

Hi @lxtGH,

I've looked into the code a bit. Recall that `ratio_list` has already been sorted in ascending order by `rank_roidb_ratio(roidb)`. The leftmost element therefore has the smallest ratio (no smaller than 0.5, since there is a lower threshold of 0.5, i.e. 1:2), and the rightmost element has the largest ratio (no larger than 2, since there is a cap of 2, i.e. 2:1).
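As a rough illustration of that ranking step, here is a minimal sketch. It assumes the ratio is width divided by height and that out-of-range ratios are clamped to [0.5, 2]; `rank_by_ratio` is a hypothetical stand-in written for this comment, not the repository's `rank_roidb_ratio`.

```python
def rank_by_ratio(sizes, ratio_small=0.5, ratio_large=2.0):
    """Sort images by clamped aspect ratio (assumed to be width / height).

    sizes: list of (width, height) tuples.
    Returns (ratios sorted ascending, original indices in that order).
    """
    ratios = []
    for width, height in sizes:
        ratio = width / float(height)
        # Clamp to the assumed thresholds of 1:2 and 2:1.
        ratio = max(ratio_small, min(ratio, ratio_large))
        ratios.append(ratio)
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    return [ratios[i] for i in order], order


sizes = [(500, 375), (200, 800), (900, 300), (400, 400)]
ratio_list, ratio_index = rank_by_ratio(sizes)
print(ratio_list)
print(ratio_index)
```

The very tall image (200 x 800) is clamped to 0.5 and lands first; the very wide one (900 x 300) is clamped to 2 and lands last.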

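If the rightmost element of a batch is still less than 1, every image in that batch has a width smaller than its height, so we take the leftmost ratio, which is the smallest in the batch. Conversely, if the leftmost element is greater than 1, every image is wider than it is tall, so we take the rightmost (largest) ratio. If the batch straddles 1, the ratio is set to 1.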
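To make the three cases concrete, here is a self-contained sketch of the same selection logic in plain Python. The function name `batch_target_ratios` and the list-based return value are my own choices for illustration; the branch conditions follow the quoted snippet.

```python
def batch_target_ratios(ratio_list, batch_size):
    """Return one target ratio per image; ratio_list must be sorted ascending."""
    data_size = len(ratio_list)
    num_batch = -(-data_size // batch_size)  # ceiling division
    targets = [0.0] * data_size
    for i in range(num_batch):
        left_idx = i * batch_size
        right_idx = min((i + 1) * batch_size - 1, data_size - 1)
        if ratio_list[right_idx] < 1:
            # Whole batch has ratio < 1: use the smallest ratio in the batch.
            target = ratio_list[left_idx]
        elif ratio_list[left_idx] > 1:
            # Whole batch has ratio > 1: use the largest ratio in the batch.
            target = ratio_list[right_idx]
        else:
            # Batch contains or straddles ratio 1: use 1.
            target = 1.0
        # Every image in the batch shares the same target ratio.
        for j in range(left_idx, right_idx + 1):
            targets[j] = target
    return targets


ratios = [0.5, 0.75, 0.9, 1.2, 1.5, 2.0]
print(batch_target_ratios(ratios, 2))  # [0.5, 0.5, 1.0, 1.0, 2.0, 2.0]
```

In each case the batch gets the most extreme ratio it contains (or 1 when it straddles 1), so all images in a batch can be brought to one common shape.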

Given that, I think it would be simpler to fix the target ratio to 0.5 (the minimum threshold) when all the images in the batch have ratios < 1, or to 2 (the maximum threshold) when they all have ratios > 1.

Please correct me if I have misunderstood anything.

jwyang / faster-rcnn.pytorch

simple question about data loader #146