reginabarzilaygroup / Sybil

Deep Learning for Lung Cancer Risk Prediction using LDCT
MIT License

I tried to use your preprocessing methods and pretrained model but it didn't work. May I check with you? #24

Closed · ZiyangLiu closed this 7 months ago

ZiyangLiu commented 9 months ago

Hello, I tried to use your preprocessing methods and pretrained model, but it didn't work on my dataset. May I check three questions with you?

(1) I used DCMTK with:

    dcmj2pnm +on2 --min-max-window --set-window -600 1500 pathToDCM pathToPNG16

Did I use the same command line as yours?

(2) I studied augmentations.py and followed its methods to convert the group of PNG16 slices into a tensor:

  - normalization with [mean, std] = [128.1722, 87.1849]
  - interpolation with TorchIO, but I changed the voxel spacing to 1.5 x 1.5 x 1.5 mm
  - the resulting tensor's [min, max] was about [-1.4701, 1.4547]
  - the tensor input into the model had shape (C, T, H, W)

[image: subplots of some slices from the interpolated tensor, plotted with plt.imshow(..., cmap='gray')]

Did I convert the PNG16 slices into a tensor in the correct way? Do you think it's a bad idea to change the voxel spacing to 1.5 x 1.5 x 1.5 mm (you use 1.4 x 1.4 x 2.5 mm)?
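
For reference, my conversion step looks roughly like this (a sketch; png_paths, orig_spacing, and the 16-bit rescaling are my own placeholder choices, not taken from augmentations.py):

import numpy as np
import torch
import torchio as tio
from PIL import Image

# png_paths and orig_spacing are hypothetical: the sorted slice files and the
# original voxel spacing (mm) along each tensor axis
slices = [np.array(Image.open(p), dtype=np.float32) for p in sorted(png_paths)]
volume = np.stack(slices)                       # (T, H, W), 16-bit PNG values
volume = volume / 256.0                         # assumed rescale of 16-bit values to roughly [0, 255]
volume = (volume - 128.1722) / 87.1849          # normalize with the stated mean/std
tensor = torch.from_numpy(volume).unsqueeze(0)  # (C=1, T, H, W)

# resample to isotropic 1.5 mm voxels with TorchIO (the repo uses 1.4 x 1.4 x 2.5 mm)
affine = np.diag(list(orig_spacing) + [1.0])    # maps voxel indices to mm along each axis
image = tio.ScalarImage(tensor=tensor, affine=affine)
resampled = tio.Resample((1.5, 1.5, 1.5))(image)
model_input = resampled.data                    # (C, T', H', W') tensor fed to the model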

(3) I loaded your pretrained model's encoder back into a standard r3d_18 and replaced its last fc layer so that it can be trained on my 5-class dataset.

import torch
import torchvision

resnet3d = torchvision.models.video.r3d_18(pretrained=True)
path = "/path/to/65fd1f04cb4c5847d86a9ed8ba31ac1aepoch=10.ckpt"
checkpoint = torch.load(path, map_location="cpu")

# the layers' names in your pretrained model are different from the standard r3d_18,
# so I change them back
state_dict = {("layer" + k[20:]): v for k, v in checkpoint["state_dict"].items()}
state_dict["stem.0.weight"] = state_dict.pop("layer0.0.weight")
state_dict["stem.1.weight"] = state_dict.pop("layer0.1.weight")
state_dict["stem.1.bias"] = state_dict.pop("layer0.1.bias")
state_dict["stem.1.running_mean"] = state_dict.pop("layer0.1.running_mean")
state_dict["stem.1.running_var"] = state_dict.pop("layer0.1.running_var")
state_dict["stem.1.num_batches_tracked"] = state_dict.pop("layer0.1.num_batches_tracked")

# keep only the keys that exist in the standard r3d_18 and load them
model_dict_copy = resnet3d.state_dict()
pretrained_dict = {k: v for k, v in state_dict.items() if k in model_dict_copy}
model_dict_copy.update(pretrained_dict)
resnet3d.load_state_dict(model_dict_copy)

Your pretrained model has other layers after the encoder, but I am not sure whether I should use them. Do you think I used your pretrained model correctly?

I froze the encoder and trained only the fc layer because my dataset is small, but the training accuracy stays very low (about 40% for 5-class classification). Do you think I made any mistakes? Thank you very much.
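
The freezing setup I used looks roughly like this (a sketch; the optimizer and learning rate are placeholders):

import torch
import torch.nn as nn

# freeze the whole encoder, then replace the head so only the new 5-class fc trains
for param in resnet3d.parameters():
    param.requires_grad = False
resnet3d.fc = nn.Linear(resnet3d.fc.in_features, 5)  # new layer defaults to requires_grad=True

optimizer = torch.optim.Adam(resnet3d.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()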

pgmikhael commented 9 months ago

Hi,

Thanks for reaching out!

  1. I think that's the same as the DCMTK conversion command we use, which is:

    dcmj2pnm +on2 +Ww -600 1500  dicom_path  image_path
  2. You've got the right tensor dimensions, so that should be good. In terms of the voxel shape, I might keep the original parameters if you intend to use the pretrained weights, but definitely feel free to experiment as they can be context-dependent.

  3. In terms of the model, it may be more straightforward to initialize the SybilNet class (see, for instance, the load_model function in model.py). The additional layers perform pooling on the output activations from the 3D ResNet and learn an attention over them. I would use the last hidden layer from that (see pool_output["hidden"] in SybilNet).

Then your model could look something like:

import torch
import torch.nn as nn
from sybil.models.sybil import SybilNet  # adjust the import to wherever SybilNet lives in the repo


class NewModel(nn.Module):
    def __init__(self, checkpoint_path):
        super(NewModel, self).__init__()

        checkpoint = torch.load(checkpoint_path, map_location="cpu")
        args = checkpoint["args"]
        model = SybilNet(args)

        # Remove the "model." prefix from the param names
        state_dict = {k[6:]: v for k, v in checkpoint["state_dict"].items()}
        model.load_state_dict(state_dict)  # type: ignore

        self.sybil_model = model
        # some model architecture you wish to train
        self.classifier = nn.Linear(512, 5)

    def forward(self, x):
        output = self.sybil_model(x)
        hidden = output["hidden"]
        y_hat = self.classifier(hidden)
        return y_hat
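
For example, you could then freeze the Sybil weights and train only the new classifier; the input shape and hyper-parameters below are placeholders:

model = NewModel("/path/to/65fd1f04cb4c5847d86a9ed8ba31ac1aepoch=10.ckpt")

# keep the pretrained Sybil weights fixed; only the classifier gets gradients
for param in model.sybil_model.parameters():
    param.requires_grad = False
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)

x = torch.randn(1, 1, 200, 256, 256)  # placeholder (B, C, T, H, W) volume
y_hat = model(x)                      # (1, 5) logits for the 5 classes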

I don't see any major issues otherwise -- I would note that training dynamics can be volatile depending on the hyper-parameters.