dk-liang / FIDTM

[IEEE TMM] Focal Inverse Distance Transform Maps for Crowd Localization
MIT License
169 stars 41 forks

Queries on custom dataset #41

Closed akshu281 closed 1 year ago

akshu281 commented 1 year ago

Hello authors,

Thank you for the great work. I have the following queries on which I need your input, as I am trying to work on this crowd localization task.

1. Has an alternate backbone, say VGG, been released in this repository that can be used instead of the current one? The current backbone is slow when training on a large number of images from my custom dataset. If not, where should I look to swap one in?

2. If my custom dataset has varied scenes and resolutions, can I simply use one of the data preparation scripts (say, the NWPU one) to generate the FIDT maps and feed them into training, keeping the longest side at no more than 2048 pixels?

3. Has the loss function described in the paper been released yet? Right now the code still uses a plain MSE loss.

4. Is there a minimum validation loss to watch for, in general, to decide on the number of epochs or to trigger early stopping?

Thank you in advance for the response.
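For reference, here is how I currently understand FIDT map generation from the paper's formula I = 1 / (D^(α·D + β) + C), where D is the distance to the nearest annotated head. This is only my own sketch; the hyperparameter values (α = 0.02, β = 0.75, C = 1) and the small-value cutoff are taken from the paper, not from this repository's scripts, so please correct me if they differ:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def fidt_map(points, h, w, alpha=0.02, beta=0.75, c=1.0):
    """Build an FIDT map of shape (h, w) from head-point annotations.

    points: iterable of (x, y) head coordinates.
    Implements I = 1 / (D**(alpha*D + beta) + c), where D is the
    Euclidean distance to the nearest annotated point. Hyperparameter
    defaults follow the paper; treat them as assumptions.
    """
    mask = np.ones((h, w), dtype=bool)
    for x, y in points:
        mask[int(y), int(x)] = False  # background=True, zero at heads
    # Distance from every pixel to the nearest annotated head.
    d = distance_transform_edt(mask)
    fidt = 1.0 / (np.power(d, alpha * d + beta) + c)
    fidt[fidt < 1e-2] = 0.0  # suppress far-away background response
    return fidt
```

The map peaks at 1 exactly on each annotated head and decays quickly with distance, which is what makes the local maxima usable for localization.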

mariosconsta commented 1 year ago
  1. As far as I know, any backbone should work, but I have not tested this myself.
  2. You can use or modify one of the data preparation files to work on your dataset. Personally, I took the JHU script and adapted it to my data.
  3. The loss function has still not been released.
  4. I did not find early stopping in the training script, but it shouldn't be hard to implement in PyTorch. Again, I have not tried this, so I am not 100% sure.
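For point 4, a minimal early-stopping helper could look like the sketch below. This is not part of this repository; the class name, `patience`, and `min_delta` defaults are my own assumptions:

```python
class EarlyStopper:
    """Stop training when validation loss hasn't improved for
    `patience` consecutive epochs (a common heuristic, not from
    the FIDTM code)."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta  # required improvement to reset
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

You would call `stopper.step(val_loss)` at the end of each validation pass and break out of the epoch loop when it returns True.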
blessedDS commented 8 months ago


Hello, have you tried implementing this?