taohan10200 / IIM

PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"
MIT License

Training on general dataset #30

Open akshu281 opened 1 year ago

akshu281 commented 1 year ago

Hello authors,

I am trying to put together a custom dataset that has both free-view and surveillance-view images. May I check the following with you:

1) What is the general code to refer to for preparing data with the scale-detection approach? The traditional approach gives me continuous blobs for a few images when I generate the maps in a common way for all of them.
2) I also understand that training may take considerable time and resources. My custom dataset is around 7K images, and the resolution varies from small to high. Which backbone or parameters would you advise for 2 GPUs with around 8-11 GB of memory each?

Thank you in advance for the timely acknowledgment and response

taohan10200 commented 1 year ago

Hi,
1) The code in datasets/dataset_prepare could be helpful for generating the binary map for your dataset. If your dataset has no bounding-box annotations or scale map, you can set a fixed box size to generate an instance map; this may lose some performance but lets you start training.
2) Considering your devices, we suggest you use the VGG backbone in this repo.
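The fixed-box-size fallback mentioned above can be sketched roughly as follows. This is a minimal illustration with NumPy, not the repo's actual datasets/dataset_prepare code; the function name, the default box size, and the (H, W) binary layout are all assumptions:

```python
import numpy as np

def points_to_instance_map(points, height, width, box_size=15):
    """Draw a fixed-size square box around each annotated head point.

    points: iterable of (x, y) head coordinates
    Returns a binary map of shape (height, width) with 1s inside each box.
    """
    inst_map = np.zeros((height, width), dtype=np.uint8)
    half = box_size // 2
    for x, y in points:
        x, y = int(round(x)), int(round(y))
        # Clip the box to the image borders
        x0, x1 = max(0, x - half), min(width, x + half + 1)
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        inst_map[y0:y1, x0:x1] = 1
    return inst_map

# Example: two well-separated heads on a 64x64 image
m = points_to_instance_map([(10, 10), (40, 50)], 64, 64, box_size=7)
```

Note that with point-only annotations, a fixed box size cannot adapt to perspective, so nearby heads may merge into one blob, which is exactly the limitation the scale-prediction path is meant to address.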

akshu281 commented 1 year ago

Thank you for the prompt reply @taohan10200! My annotations are all point-wise for this dataset. May I check whether I can use the scale prediction network to get the maps? Is there any change I should make to the code under datasets/dataset_prepare, since I see dataset-specific scripts for the different datasets?

Also, may I check whether the resizing module in the dataset preparation step is fixed to a particular size for every dataset we use?