i6092467 / semi-supervised-multiview-cbm

Concept bottleneck models for multiview data with incomplete concept sets
https://doi.org/10.1016/j.media.2023.103042

Assistance Needed for Running Inference with a Trained Model #1

Open SomaRe opened 7 months ago

SomaRe commented 7 months ago

Hello,

I am currently working on a PyTorch assignment that involves choosing a research project (which utilizes PyTorch) for reproduction. The goal is to deepen our understanding of PyTorch through hands-on experience. After selecting the research paper and its associated code, I successfully executed the training phase and obtained the final model saved at log/checkpoints/final_model_test-42_mvcbm-app.pth. My next step is to implement an inference process, potentially integrating it into a simple Flask server, where the model predicts the likelihood of appendicitis from preprocessed images.

However, I have not been able to set up an inference pipeline that accepts image inputs and returns confidence scores. I also asked my teaching assistant for help, but they were unable to resolve it either.

Here's the skeleton of the code I attempted, but I recognize it lacks essential components for successful inference:

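import torch
from torchvision.io import ImageReadMode, read_image
from networks import create_model

def load_model(model_path, config):
    model = create_model(config)
    # map_location lets a checkpoint saved on a GPU load on the configured device
    state_dict = torch.load(model_path, map_location=config['device'])
    model.load_state_dict(state_dict)
    model.to(config['device'])
    model.eval()  # Switch off dropout and batch-norm updates for inference
    return model

def prepare_image(image_path):
    # Assuming image is already preprocessed and just needs to be loaded.
    # torch.load cannot read a PNG, so decode it with torchvision instead;
    # the RGB mode and [0, 1] scaling are assumptions about what the model expects.
    image = read_image(image_path, mode=ImageReadMode.RGB).float() / 255.0
    if image.dim() == 3:
        image = image.unsqueeze(0)  # Add a batch dimension: (1, C, H, W)
    return image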

def infer(model, image, device='cuda'):
    image = image.to(device)  # Move image to the configured device
    with torch.no_grad():  # Inference doesn't require gradient calculation
        output = model(image)
        if isinstance(output, tuple):  # If your model returns multiple outputs, adjust accordingly
            output = output[0]  # Assuming the first output is what you're interested in
        predicted_prob = torch.sigmoid(output)  # Apply sigmoid if your output is logits for binary classification
        predicted_label = (predicted_prob > 0.5).int()  # Convert probabilities to binary labels
    return predicted_label.item()  # Return the prediction as a Python integer

model_path = 'log/checkpoints/final_model_test-42_mvcbm-app.pth'
config = {
    'experiment_name': 'mvcbm-app',
    'run_name': 'test-42',
    'seed': 42,
    'validate': False,
    'k_folds': 5,
    'device': 'cuda',
    'workers': 2,
    'log_directory': 'log',
    'model_directory': 'pretrained_models',
    'dataset': 'app',
    'num_classes': 2,
    'images': 'notebooks/preprocessed_ultrasound_images/constant_padding/deepfilled_cropped_train',
    'test_images': 'notebooks/preprocessed_ultrasound_images/constant_padding/deepfilled_cropped_test',
    'dict_file': 'notebooks/data_dictionaries/diagnosis/app_data_train',
    'dict_file_test': 'notebooks/data_dictionaries/diagnosis/app_data_test',
    'preload': False,
    'augmentation': True,
    'aug_per_sample': 1,
    'hist_equal': True,
    'normalize': False,
    'brightness': True,
    'rotate': True,
    'shear': True,
    'resize': True,
    'gamma': True,
    'sharpness': True,
    'gaussian_noise': True,
    'poisson_noise': False,
    'SP_noise': False,
    'zero_rect': 0.05,
    'model': 'MVCBM',
    'encoder_arch': 'ResNet18',
    'aggregator': 'lstm',
    'training_mode': 'sequential',
    'alpha': 1.0,
    'num_concepts': 9,
    't_hidden_dim': 5,
    'norm_bottleneck': False,
    'attention': False,
    'c_epochs': 20,
    'c_learning_rate': 0.0001,
    't_epochs': 20,
    't_learning_rate': 0.01,
    'j_epochs': 40,
    'j_learning_rate': 0.001,
    'train_batch_size': 4,
    'val_batch_size': 4,
    'optimizer': 'adam',
    'decrease_every': 150,
    'lr_divisor': 2,
    'weight_decay': 0,
    'validate_every_epoch': True,
    'ex_features': []
}

image_path = 'notebooks/preprocessed_ultrasound_images/constant_padding/deepfilled_cropped_test/13.1_Appendix.png'  # Path to your preprocessed image

# Load the model
model = load_model(model_path, config)

# Prepare the image
image = prepare_image(image_path)

# # Run inference
# prediction = infer(model, image)
# print(f'Prediction: {prediction} (0: No Appendicitis, 1: Appendicitis)')

Could anyone provide guidance or examples on how to properly load the trained model and run inference on one or more images? Any insights or pointers would be greatly appreciated.

My confusion also stems from how to handle multiple images for inference.

Thank you for your time and assistance.
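
For the multi-image question above, one possible approach is sketched below. It stacks the per-view images of a single patient into one tensor, zero-pads missing views, and converts the model output into a confidence score. Everything here is an assumption rather than the repository's actual interface: `stack_views`, `positive_class_confidence`, and `ToyMultiViewModel` are hypothetical names, and the `(batch, views, channels, height, width)` layout and the `forward(x, mask)` signature should be checked against the model's real `forward` method in `networks.py`.

```python
import torch
import torch.nn as nn


def stack_views(views, max_views):
    """Stack per-view tensors (C, H, W) into one (1, max_views, C, H, W) batch.

    Missing views are zero-padded; the returned mask marks the real views.
    """
    if not 0 < len(views) <= max_views:
        raise ValueError("expected between 1 and max_views images")
    c, h, w = views[0].shape
    batch = torch.zeros(1, max_views, c, h, w)
    mask = torch.zeros(1, max_views, dtype=torch.bool)
    for i, view in enumerate(views):
        batch[0, i] = view
        mask[0, i] = True
    return batch, mask


def positive_class_confidence(logits):
    """Return P(class 1) for logits of shape (B, 1) or (B, 2)."""
    if logits.shape[-1] == 1:
        return torch.sigmoid(logits).squeeze(-1)
    return torch.softmax(logits, dim=-1)[..., 1]


class ToyMultiViewModel(nn.Module):
    """Hypothetical stand-in: mean-pools the real views, then a linear head."""

    def __init__(self, in_features, num_classes=2):
        super().__init__()
        self.head = nn.Linear(in_features, num_classes)

    def forward(self, x, mask):
        feats = x.flatten(start_dim=2)        # (B, V, C*H*W)
        weights = mask.unsqueeze(-1).float()  # (B, V, 1)
        pooled = (feats * weights).sum(dim=1) / weights.sum(dim=1)
        return self.head(pooled)


# Two available views, padded up to four.
views = [torch.rand(3, 8, 8) for _ in range(2)]
batch, mask = stack_views(views, max_views=4)
model = ToyMultiViewModel(in_features=3 * 8 * 8).eval()
with torch.no_grad():
    confidence = positive_class_confidence(model(batch, mask))
```

Returning the probability instead of a thresholded label keeps the confidence score available to a caller such as a web endpoint, which can apply its own threshold.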

armoaguille commented 4 months ago

Hey, I have the same problem with implementing the model. Have you been able to resolve the issue, or could you share what steps you took in implementing your code? Any insights or tips would be greatly appreciated! Thanks in advance.
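
As a starting point for the loading step discussed in this thread, here is a minimal sketch of restoring weights for inference. It assumes the checkpoint holds a `state_dict` (possibly wrapped in a dict under a `"state_dict"` key) rather than a whole pickled model; whether that matches this repository's saved file is not confirmed here. `load_for_inference` and `build_toy_model` are hypothetical names, and with the real code `build_model` would be something like `lambda: create_model(config)`.

```python
import os
import tempfile

import torch
import torch.nn as nn


def load_for_inference(build_model, checkpoint_path, device="cpu"):
    """Build the architecture, load saved weights, and switch to eval mode."""
    model = build_model()
    # map_location lets a checkpoint saved on a GPU load on a CPU-only machine.
    state = torch.load(checkpoint_path, map_location=device)
    # Some training scripts wrap the weights under a key like this (assumption).
    if isinstance(state, dict) and "state_dict" in state:
        state = state["state_dict"]
    model.load_state_dict(state)
    return model.to(device).eval()


def build_toy_model():
    # Hypothetical stand-in for the real architecture.
    return nn.Linear(4, 2)


# Round trip: save a toy model's weights, then restore them.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.pth")
    source = build_toy_model()
    torch.save(source.state_dict(), path)
    restored = load_for_inference(build_toy_model, path)
```

If `load_state_dict` reports missing or unexpected keys, the architecture built from the config most likely differs from the one used in training, so the same config values should be used for both.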