spinal-cord-7t / coil-qc-code

7T Spinal Cord Coil QC Analysis Code

Uniformise code between phantom and human notebooks #104

Open jcohenadad opened 1 week ago

jcohenadad commented 1 week ago

This issue describes the various items that need to be taken care of to uniformise the code between the phantom and human notebooks.

Implement in the phantom notebook the data dictionary from the human notebook:

https://github.com/spinal-cord-7t/coil-qc-code/blob/453b88897fa66d5c8e902fbe24ab20026aed7b4a/data_processing-human.ipynb#L583-L594

jcohenadad commented 1 week ago

Suggestion (from ChatGPT) to use a shared codebase:

The most elegant way to share code between two different Jupyter notebooks while minimizing duplication is to refactor the shared code into a separate, importable Python module. This allows you to define common functions, classes, and variables in a single file that both notebooks can access. Here’s how you can achieve this:

Step 1: Create a Separate Python Module for Shared Code

Create a new Python file (e.g., shared_code.py) that contains all the functions, classes, and variables you want to share between the notebooks. Place this file in the same directory as your Jupyter notebooks or in a subdirectory that is easily accessible.

For example, if you have two notebooks named notebook1.ipynb and notebook2.ipynb:
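project/
├── notebook1.ipynb
├── notebook2.ipynb
└── shared_code.py

A minimal sketch of what shared_code.py could contain is shown below. The function and class names match the imports used in Step 2; the bodies are illustrative placeholders, not the actual QC code:

# shared_code.py -- illustrative placeholder implementations
import numpy as np
import matplotlib.pyplot as plt


def preprocess_data(data):
    """Convert raw input to a clean float array (placeholder logic)."""
    return np.asarray(data, dtype=float)


def custom_plot(data, title=""):
    """Shared plotting helper used by both notebooks."""
    plt.plot(data)
    plt.title(title)
    plt.show()


class DataProcessor:
    """Holds preprocessed data and applies a processing step."""

    def __init__(self, data):
        self.data = data
        self.result = None

    def process(self):
        # Placeholder processing step
        self.result = np.nanmean(self.data)
        return self.result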

Step 2: Import the Shared Code in Each Notebook

In each of your Jupyter notebooks, import the shared code module using the following syntax:

# In notebook1.ipynb and notebook2.ipynb

from shared_code import preprocess_data, custom_plot, DataProcessor

# Now you can use the shared functions and classes
data = [...]  # Replace with actual data
preprocessed_data = preprocess_data(data)
processor = DataProcessor(preprocessed_data)
processor.process()
custom_plot(preprocessed_data, title="Example Plot")

Step 3: Use a notebooks Directory and Set PYTHONPATH (Optional)

If you want to organize your project further, you can create a notebooks directory for your Jupyter notebooks and a src or utils directory for the shared code:

project_root/
│
├── notebooks/
│   ├── notebook1.ipynb
│   └── notebook2.ipynb
│
└── src/
    └── shared_code.py

To access the shared code from within notebook1.ipynb or notebook2.ipynb, you can modify the sys.path in each notebook to include the src directory:

import sys
sys.path.append('../src')  # Adjust path as needed

from shared_code import preprocess_data, custom_plot, DataProcessor
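Relative entries on sys.path are resolved against the current working directory, so '../src' only works while the notebook is run from the notebooks/ folder. A small variant (a sketch, assuming the kernel's working directory is the notebooks/ folder, which is Jupyter's default) appends an absolute path instead, which stays valid even if the notebook later calls os.chdir (as the processing code further down in this issue does):

import sys
from pathlib import Path

# Resolve ../src to an absolute path so later os.chdir calls cannot break imports
src_dir = (Path.cwd().parent / "src").resolve()
sys.path.append(str(src_dir))

from shared_code import preprocess_data, custom_plot, DataProcessor

Alternatively, set the PYTHONPATH environment variable to project_root/src before launching Jupyter, which avoids editing sys.path in every notebook.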

Step 4: Use Jupyter Notebook Magics for Development (Optional)

While developing the shared code, you can use the Jupyter notebook %autoreload magic to automatically reload changes from your shared module without restarting the kernel:

%load_ext autoreload
%autoreload 2

from shared_code import preprocess_data, custom_plot, DataProcessor

With this, any changes you make to shared_code.py are picked up automatically the next time a cell runs, without restarting the kernel or manually re-importing the module.

Benefits of This Approach

Keeping the shared functions in a single module means there is no duplicated code to keep in sync, the notebooks stay clean and focused on the analysis itself, and the shared code can be version-controlled, reviewed, and tested independently of the notebooks.

Alternative Approaches

  1. Jupyter Notebook %run Magic: You can place shared code in a separate Jupyter notebook (e.g., shared_code.ipynb) and use the %run magic command in both notebook1.ipynb and notebook2.ipynb:

    %run ./shared_code.ipynb

    However, this approach is less flexible than using a Python module and can lead to issues with dependency tracking.

  2. Notebook-to-Module Tools: Use tools such as nbdev, which can export code from Jupyter notebooks into importable Python modules and handle code sharing. This is particularly useful for complex projects but adds some overhead.

Final Recommendation

The most maintainable and scalable solution is to create a separate Python module (shared_code.py) and import it in each notebook. This keeps your notebooks clean, avoids code duplication, and allows you to organize your shared codebase effectively.

jcohenadad commented 1 week ago

Additional suggestions about using functions vs. a script:

In your scenario, you have a large code block that uses variables defined earlier in the notebook, and you want to duplicate or share this code without manually copying it between notebooks. Since this code relies on variables like subjects, GAMMA, and voltages defined earlier, directly extracting it into a separate function in a module would require passing these variables as arguments.

Instead, you can employ a combination of approaches that allows for shared code execution while keeping variable dependencies intact. Here’s an elegant way to handle this:

Approach 1: Define a Reusable Function with Parameters in a Shared Module

Move the repetitive code into a function that accepts the necessary variables as parameters. This helps avoid code duplication while maintaining flexibility. Here's how you can do it:

  1. Create a shared_code.py module:

    # shared_code.py
    
    import os
    import json
    import nibabel as nib
    import numpy as np
    from pathlib import Path
    
    def process_subject(subject, path_data, GAMMA, voltages, siemens_loss_factor=-0.095):
       """
       Processes a subject's B1 maps based on given reference voltages and stores the average map.
    
       Parameters:
       subject (str): The subject identifier.
       path_data (str): The path to the data directory.
       GAMMA (float): The gyromagnetic ratio.
       voltages (list): List of reference voltages to check.
       siemens_loss_factor (float): The power loss factor for the Siemens coil. Default is -0.095.
    
       Returns:
       None
       """
       b1_maps = []
       os.chdir(os.path.join(path_data, subject, "fmap"))
    
       # Fetch the reference voltage depending on subject or from JSON
       if subject == 'sub-MSSM1':
           ref_voltage = 450
       elif subject == 'sub-MSSM2':
           ref_voltage = 350
       elif subject == 'sub-MSSM3':
           ref_voltage = 450
       else:
           with open(f"{subject}_acq-famp_TB1DREAM.json", "r") as f:
               metadata = json.load(f)
               ref_voltage = metadata.get("TxRefAmp", "N/A")
               if ref_voltage == "N/A":
                   ref_token = "N/A"
                   for token in metadata.get("SeriesDescription", "N/A").split("_"):
                       if token.startswith("RefV"): ref_token = token
                   ref_voltage = float(ref_token[4:-1])
    
       # Process initial reference flip angle map
       nii = nib.load(f"{subject}_acq-famp_TB1DREAM.nii.gz")
       meas_fa = nii.get_fdata()
       meas_fa[meas_fa < 200] = np.nan
       meas_fa[meas_fa > 500] = np.nan
    
       with open(f"{subject}_acq-famp_TB1DREAM.json", "r") as f:
           metadata = json.load(f)
           requested_fa = metadata.get("FlipAngle", "N/A")
           meas_fa = (meas_fa / 10) / requested_fa
    
       voltage_at_socket = ref_voltage * 10 ** siemens_loss_factor
       b1_map = meas_fa * (np.pi / (GAMMA * 1e-3 * voltage_at_socket))
       b1_map = b1_map * 1e9
       b1_maps.append(b1_map)
    
       # Process other reference voltage maps
       for voltage in voltages:
           my_file = Path(f"{subject}_acq-famp-{voltage}_TB1DREAM.nii.gz")
           if my_file.is_file():
               if subject == 'sub-MSSM2' and voltage == "1.5":
                   ref_voltage = 450
               elif subject == 'sub-MSSM2' and voltage == "0.66":
                   ref_voltage = 234
               elif subject == 'sub-MSSM3' and voltage == "0.66":
                   ref_voltage = 328
               else:
                   with open(f"{subject}_acq-famp-{voltage}_TB1DREAM.json", "r") as f:
                       metadata = json.load(f)
                       ref_voltage = metadata.get("TxRefAmp", "N/A")
                       if ref_voltage == "N/A":
                           ref_token = "N/A"
                           for token in metadata.get("SeriesDescription", "N/A").split("_"):
                               if token.startswith("RefV"): ref_token = token
                           ref_voltage = float(ref_token[4:-1])
    
               nii = nib.load(f"{subject}_acq-famp-{voltage}_TB1DREAM.nii.gz")
               meas_fa = nii.get_fdata()
               meas_fa[meas_fa < 200] = np.nan
               meas_fa[meas_fa > 500] = np.nan
    
               with open(f"{subject}_acq-famp-{voltage}_TB1DREAM.json", "r") as f:
                   metadata = json.load(f)
                   requested_fa = metadata.get("FlipAngle", "N/A")
                   meas_fa = (meas_fa / 10) / requested_fa
           else:
               meas_fa = np.full((nii.header).get_data_shape(), np.nan)
    
           voltage_at_socket = ref_voltage * 10 ** siemens_loss_factor
           b1_map = meas_fa * (np.pi / (GAMMA * 1e-3 * voltage_at_socket))
           b1_map = b1_map * 1e9
           b1_maps.append(b1_map)
    
       avgB1 = np.nanmean(b1_maps, axis=0)
    
       nii_avgB1 = nib.Nifti1Image(avgB1, nii.affine, nii.header)
       nib.save(nii_avgB1, f"{subject}_DREAMTB1avgB1map.nii.gz")
  2. In each Jupyter notebook:

    Import the shared function and call it with the required parameters. This allows you to execute shared code without duplicating it across notebooks.

    # notebook1.ipynb
    
    from shared_code import process_subject
    
    # Define the necessary variables
    GAMMA = 2.675e8  # [rad / (s T)]
    voltages = ["1.5", "0.66"]
    path_data = "/path/to/data"
    subjects = ["sub-MSSM1", "sub-MSSM2", "sub-MSSM3"]
    
    for subject in subjects:
       process_subject(subject, path_data, GAMMA, voltages)
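One note for the uniformisation itself: the subject-specific reference voltages (sub-MSSM1, sub-MSSM2, sub-MSSM3) are hard-coded inside process_subject, which ties the shared module to the human data. A hypothetical variant (not in the code above) would accept these overrides as an argument, so the same function could serve both the phantom and human notebooks:

# Hypothetical helper (illustration only, not part of shared_code.py above):
# let the calling notebook supply per-subject reference-voltage overrides
# instead of hard-coding them in the shared module
def get_ref_voltage(subject, metadata, overrides=None):
    """Return the override voltage for this subject if one is given,
    otherwise fall back to the TxRefAmp field of the JSON sidecar."""
    overrides = overrides or {}
    if subject in overrides:
        return overrides[subject]
    return metadata.get("TxRefAmp")

# The human notebook would pass its exceptions explicitly, e.g.
# get_ref_voltage(subject, metadata, {"sub-MSSM1": 450, "sub-MSSM2": 350, "sub-MSSM3": 450})
# while the phantom notebook would call it with no overrides.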

Approach 2: Use the %run Magic with Pre-defined Variables

If you prefer to keep using variables without refactoring the code into a function, you can separate the repetitive code block into another Jupyter notebook (e.g., shared_code.ipynb) and use the %run magic command:

  1. Create a notebook called shared_code.ipynb that only contains the shared code (remove variable definitions).

  2. In the main notebook (e.g., notebook1.ipynb), define the variables first and then execute shared_code.ipynb:

    # notebook1.ipynb
    
    GAMMA = 2.675e8  # [rad / (s T)]
    voltages = ["1.5", "0.66"]
    path_data = "/path/to/data"
    subjects = ["sub-MSSM1", "sub-MSSM2", "sub-MSSM3"]
    
     %run -i ./shared_code.ipynb

With the -i flag, shared_code.ipynb is executed in the current namespace, so its code can use the variables defined in the main notebook (without -i, %run executes the file in an empty namespace and those variables would be undefined). This approach keeps the execution within the notebook context, but it can be less clear if you have complex dependencies or want to share code outside Jupyter notebooks.
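Since shared_code.ipynb silently depends on variables created elsewhere, it can help to make that dependency explicit. A minimal sketch of a first cell for shared_code.ipynb (the variable names are the ones defined above) that fails early if the calling notebook forgot to define them:

# First cell of shared_code.ipynb (sketch): check that the calling notebook
# defined everything this code depends on before any processing starts
required = ["GAMMA", "voltages", "path_data", "subjects"]
missing = [name for name in required if name not in globals()]
if missing:
    raise NameError(f"Define these variables before running shared_code.ipynb: {missing}")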