mahmoodlab / CLAM

Data-efficient and weakly supervised computational pathology on whole slide images - Nature Biomedical Engineering
http://clam.mahmoodlab.org
GNU General Public License v3.0

Adapting data loader to multi class & multi label #247

Closed Fel1xJulian closed 3 weeks ago

Fel1xJulian commented 5 months ago

Hi,

Has anyone successfully implemented a multi-class, multi-label version of CLAM? If so, could you tell me which parts of the data loader I would need to adapt and how, or share the adapted code? I currently have one CSV containing all files and three separate CSV files containing the train, val and test splits. In total I have 5 classes, multi-hot encoded, e.g. 0,1,0,0,1. Any help would be much appreciated.

fedshyvana commented 3 weeks ago

Unfortunately, this is not a straightforward modification given the current code base. If you really want to do it, you can start by updating the `__getitem__` method of the dataset class so that it returns the multi-hot encoded labels. You would then need to modify the model and loss function so that a sigmoid activation and binary cross-entropy loss are used during training and inference, with each class scored independently.
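To make the two steps concrete, here is a minimal, framework-free sketch of the idea. It is not CLAM's actual code: the class name `MultiLabelSlideDataset`, the column names `class_0` to `class_4`, the CSV layout, and the example logits are all assumptions made up for illustration. It only shows a `__getitem__` that returns a multi-hot label and a per-class sigmoid with binary cross-entropy. In a PyTorch code base the loss part would typically be handled by `torch.nn.BCEWithLogitsLoss`, which combines the sigmoid and the loss in one step.

```python
import csv
import io
import math

# Hypothetical label columns; the thread only says there are 5 multi-hot classes.
LABEL_COLS = ["class_0", "class_1", "class_2", "class_3", "class_4"]

# Hypothetical CSV layout: one row per slide, one 0/1 column per class.
EXAMPLE_CSV = """slide_id,class_0,class_1,class_2,class_3,class_4
slide_a,0,1,0,0,1
slide_b,1,0,0,0,0
"""


class MultiLabelSlideDataset:
    """Toy dataset whose __getitem__ returns a multi-hot label vector."""

    def __init__(self, csv_text, label_cols):
        self.rows = list(csv.DictReader(io.StringIO(csv_text)))
        self.label_cols = label_cols

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        # Floats, because binary cross-entropy expects float targets.
        label = [float(row[c]) for c in self.label_cols]
        # A real dataset would also load the slide's patch features here.
        return row["slide_id"], label


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over classes, one independent sigmoid per class."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Stable form of -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))]
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict(logits, threshold=0.5):
    """Threshold each class probability independently (no softmax, no argmax)."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


dataset = MultiLabelSlideDataset(EXAMPLE_CSV, LABEL_COLS)
slide_id, label = dataset[0]
logits = [-2.0, 3.0, -1.5, -4.0, 2.5]  # made-up model outputs, one per class
loss = bce_with_logits(logits, label)
preds = predict(logits)
```

The key difference from the single-label setup is that the model outputs one logit per class and each is thresholded on its own, so a slide can be positive for several classes at once. The 0.5 threshold is just a default assumption here and would normally be tuned per class on the validation split.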