xingyizhou / ExtremeNet

Bottom-up Object Detection by Grouping Extreme and Center Points
BSD 3-Clause "New" or "Revised" License

Issue with training on my own data #16

Open zhaobinglei opened 5 years ago

zhaobinglei commented 5 years ago

Your work inspired me! However, I ran into some problems and really need your help. I trained on my own dataset, which is formatted just like COCO, but training fails with the errors below. Could you please help?

Traceback (most recent call last):
  File "train.py", line 61, in prefetch_data
    data, ind = sample_data(db, ind, data_aug=data_aug, debug=debug)
  File "/data0/svc8/pytorchprojects/ExtremeNet/sample/coco_extreme.py", line 245, in sample_data
    return globals()[system_configs.sampling_function](db, k_ind, data_aug, debug)
  File "/data0/svc8/pytorchprojects/ExtremeNet/sample/coco_extreme.py", line 187, in kp_detection
    t_regrs[b_ind, tag_ind, :] = [fxt - xt, fyt - yt]
IndexError: index 128 is out of bounds for axis 1 with size 128
Process Process-2:
Traceback (most recent call last):
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "train.py", line 65, in prefetch_data
    raise e
  File "train.py", line 61, in prefetch_data
    data, ind = sample_data(db, ind, data_aug=data_aug, debug=debug)
  File "/data0/svc8/pytorchprojects/ExtremeNet/sample/coco_extreme.py", line 245, in sample_data
    return globals()[system_configs.sampling_function](db, k_ind, data_aug, debug)
  File "/data0/svc8/pytorchprojects/ExtremeNet/sample/coco_extreme.py", line 187, in kp_detection
    t_regrs[b_ind, tag_ind, :] = [fxt - xt, fyt - yt]
IndexError: index 128 is out of bounds for axis 1 with size 128
  0%|                                   | 12/250000 [01:18<453:10:32,  6.53s/it]Exception in thread Thread-1:
Traceback (most recent call last):
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "train.py", line 69, in pin_memory
    data = data_queue.get()
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
    fd = df.detach()
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/data0/svc8/anaconda3/envs/ExtremeNet/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
xingyizhou commented 5 years ago

Hi, it seems that your dataset has more than 128 objects in a single image, which is why index 128 is out of bounds. You can increase the max object limit here, or ignore the overflowed objects by changing this line to tag_lens[b_ind] += 1 if tag_lens[b_ind] < max_tag_len - 1 else 0.
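
A minimal sketch of the second option, written as an explicit skip rather than the one-line counter change quoted above. The names assign_tags and MAX_TAG_LEN are hypothetical and not from the repository; the cap of 128 is taken from the array size in the traceback.

```python
# Hypothetical per-image object cap; 128 matches the axis size in the traceback.
MAX_TAG_LEN = 128


def assign_tags(num_objects, max_tag_len=MAX_TAG_LEN):
    """Fill per-image slots in order and skip any objects beyond the cap.

    Returns (kept, dropped): how many objects got a slot and how many
    were ignored because the image had more objects than slots.
    """
    slots = [None] * max_tag_len
    tag_len = 0   # next free slot, plays the role of tag_lens[b_ind]
    dropped = 0
    for obj_id in range(num_objects):
        if tag_len >= max_tag_len:
            # Writing to slots[tag_len] here would raise IndexError,
            # so the overflowed object is ignored instead.
            dropped += 1
            continue
        slots[tag_len] = obj_id
        tag_len += 1
    return tag_len, dropped


if __name__ == "__main__":
    print(assign_tags(130))
```

Counting the dropped objects makes it easy to see how much of the dataset the cap is hiding; if the number is large, raising the limit is probably the better choice.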

zhaobinglei commented 5 years ago

Thanks a lot! I have fixed this problem. By the way, I used Extreme-250000.pkl as the pre-trained model and trained for 20000 epochs on my dataset. But when I tested the new model on my dataset, it produced no extreme points or bounding boxes. Is this normal?

yezhengli-Mr9 commented 5 years ago

I trained on my own dataset (my training data is small, only 100 labelled masks, converted into JSON data by this) to get Extreme-18000.pkl, and when predicting there are no extreme points, let alone bounding boxes.

However, I am not sure whether my code is correct, since with Xingyi's Extreme-250000.pkl I definitely cannot find my extreme points.

yezhengli-Mr9 commented 5 years ago

Have you adjusted "categories" in the config/*.json file from 80 (COCO) to your own number? (I am debugging this, since the network layers need to be adjusted at the same time.) @lesleyhaha

I think this may speed things up, but the performance may not be guaranteed.
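
For reference, a minimal sketch of changing the category count programmatically. It assumes the count sits under a "db" section as a "categories" key, which is an assumption about the file layout; set_categories is a hypothetical helper, and the network's output layers still have to be changed separately.

```python
import json


def set_categories(config_text, num_categories):
    """Return config JSON text with the category count replaced.

    Assumes (not confirmed by this thread) that the count is stored
    as config["db"]["categories"].
    """
    config = json.loads(config_text)
    config.setdefault("db", {})["categories"] = num_categories
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    # Toy config for illustration only; a real file has many more keys.
    original = '{"db": {"categories": 80}}'
    print(set_categories(original, 3))
```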

yezhengli-Mr9 commented 5 years ago

I have just successfully obtained extreme points and bounding boxes on my own dataset (with merely 100 training samples) using just Extreme-2000.pkl. @lesleyhaha

zhaobinglei commented 5 years ago

Thank you so much! I will try.

zhaobinglei commented 5 years ago

@yezhengli-Mr9 Thanks for your reply. However, after I changed the categories in config/*.json and in the network, there were still no extreme points or bounding boxes, even though the loss dropped from more than 200 to less than 10. Did you change any other configurations? Thanks!

annzheng commented 5 years ago

Hi, I have changed the categories in config/*.json and db/coco_extreme.py, but I got this error:

RuntimeError: The shape of the mask [1, 1, 128, 128] at index 1 does not match the shape of the indexed tensor [1, 80, 128, 128] at index 1

What other places need to be changed?
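
The two shapes in that error differ only in the second dimension (1 versus 80), which suggests that the two tensors disagree on the number of categories. A minimal sketch of a check that reports this directly, assuming both are laid out as (batch, categories, height, width); check_heatmap_shapes is a hypothetical helper, not part of the repository.

```python
def check_heatmap_shapes(pred_shape, gt_shape):
    """Raise a readable error when two heatmap shapes disagree.

    Both shapes are assumed to be (batch, categories, height, width).
    """
    pred_shape, gt_shape = tuple(pred_shape), tuple(gt_shape)
    if len(pred_shape) != 4 or len(gt_shape) != 4:
        raise ValueError("expected 4-D shapes: (batch, categories, height, width)")
    if pred_shape[1] != gt_shape[1]:
        # This is the case seen in the error above: 80 versus 1.
        raise ValueError(
            "category channels differ: %d in one tensor, %d in the other"
            % (pred_shape[1], gt_shape[1])
        )
    if pred_shape != gt_shape:
        raise ValueError("shape mismatch: %r vs %r" % (pred_shape, gt_shape))


if __name__ == "__main__":
    try:
        check_heatmap_shapes((1, 80, 128, 128), (1, 1, 128, 128))
    except ValueError as err:
        print(err)
```

Running a check like this on the first batch points at a leftover hard-coded category count somewhere between the data sampler and the network output.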