WonderLandxD / WiKG

[CVPR 2024] Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis

Memory Consumption #3

Open shubhaminnani opened 1 month ago

shubhaminnani commented 1 month ago

Hi,

Thank you for the wonderful model. When I tried to train it, I noticed that around 48 GB of GPU memory was required. Was this the case for you as well?

Thanks, Shubham

WonderLandxD commented 1 month ago

Hello,

The increased memory usage is most likely due to the large number of cropped patches. In the TCGA dataset we use, each slide yields relatively few patches (up to about 15,000), so 24 GB of GPU memory is enough. With other datasets (for example, when a slide produces more than 20,000 patches), 24 GB may not be sufficient, so training WiKG on a GPU with more memory is worth trying. We are also discussing the implementation of WiKG v2, which focuses on model complexity and scaling; we will release it when it is ready, so stay tuned.

There are several solutions:

  1. Use a larger segmentation threshold to reduce the number of patches. Fewer patches can sometimes even improve overall performance.

  2. Use mixed precision to train the model (see the mixed-precision sketch after this list).

  3. Reduce the number of neighbor nodes. We found that varying the number of neighbor nodes in graph representation methods (not just the aggregation strategy proposed by WiKG) makes little difference in the results when analyzing WSIs (this is also a question worth exploring). A top-k neighbor sketch follows this list.

  4. Consider setting a smaller feature length in the fully connected layer at the start of the model. Some of our experiments suggest that 512 may be larger than necessary (also shown in the sketch below).
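
For point 2, here is a minimal mixed-precision training sketch using PyTorch's standard `torch.cuda.amp` utilities. The model, optimizer, and synthetic feature bag below are placeholders standing in for your own WiKG setup, not code from this repository:

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler

# Placeholder stand-ins for the real WiKG model and WSI feature bags:
# here the "model" is just a linear classifier over mean-pooled patch features.
model = nn.Linear(512, 2).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
scaler = GradScaler()

for step in range(10):
    feats = torch.randn(15000, 512, device="cuda")   # one bag of patch features
    label = torch.randint(0, 2, (1,), device="cuda")
    optimizer.zero_grad()
    # The forward pass runs in float16 where safe, roughly halving activation memory.
    with autocast():
        logits = model(feats.mean(dim=0, keepdim=True))
        loss = criterion(logits, label)
    # Scale the loss to avoid float16 gradient underflow, then step and update.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```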
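For points 3 and 4, the sketch below illustrates the general idea of projecting patch features to a smaller embedding and keeping only the top-k most similar neighbors per node. It is not the repository's implementation; the dimensions, `topk`, and variable names are illustrative assumptions:

```python
import torch
import torch.nn as nn

feats = torch.randn(2000, 1024)        # patch features from the feature extractor
proj = nn.Linear(1024, 256)            # smaller than 512 to reduce memory
emb = proj(feats)                      # (N, 256)

# Pairwise similarity between patch embeddings; note this (N, N) matrix itself
# grows quadratically with the number of patches.
sim = emb @ emb.t()

topk = 4                               # fewer neighbor nodes -> smaller graph
scores, neighbor_idx = sim.topk(topk, dim=-1)   # each node's k most similar patches
neighbor_feats = emb[neighbor_idx]     # (N, topk, 256) features to aggregate

# In practice, each node's own index usually appears in its top-k and may need
# to be excluded, depending on how the aggregation is defined.
```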

Hope this helps you:)