GlastonburyC / RNAPath

Self-supervised representation learning combining GTEx histology, RNA-seq and WGS
GNU General Public License v3.0

Question Regarding Coordinate Accuracy of Labelled Tiles #3

Closed: Hanminghao closed this issue 2 weeks ago

Hanminghao commented 3 weeks ago

Hello, author. Thank you very much for open-sourcing such fantastic work. I have some questions regarding the labelled tiles you provided. I attempted to stitch a WSI slice based on the coordinates indicated in each tile's JPG file name, but it seems that I ended up with some fragmented results. Could it be that there are errors in the coordinates of the tile filenames? Below are my code and the results I obtained. Please note that I only stitched tiles measuring 128x128 pixels.

import os
from PIL import Image

TILE_SIZE = 128  # only 128x128-pixel tiles are stitched

def get_coordinates(filename):
    # Reads the first two '=' fields inside [...] as x and y, in that order.
    parts = filename.split('[')[1].split(']')[0].split(',')
    x = int(parts[0].split('=')[1])
    y = int(parts[1].split('=')[1])
    return x, y

def create_large_image(folder_path, scale_factor):
    tiles = []  # (path, x, y); images are opened later, one at a time
    max_x = max_y = 0

    # First pass: collect tile positions and find the canvas extent.
    for dirpath, _dirnames, filenames in os.walk(folder_path):
        for filename in filenames:
            if filename.endswith('.jpg'):
                x, y = get_coordinates(filename)
                tiles.append((os.path.join(dirpath, filename), x, y))
                max_x = max(max_x, x)
                max_y = max(max_y, y)

    # White canvas large enough to hold the bottom-right tile.
    large_width = max_x + TILE_SIZE
    large_height = max_y + TILE_SIZE
    large_image = Image.new('RGB', (large_width, large_height), (255, 255, 255))

    # Second pass: paste each tile at its top-left coordinate.
    for img_path, x, y in tiles:
        with Image.open(img_path) as img:
            large_image.paste(img, (x, y))

    # Downscale the stitched canvas by scale_factor.
    if scale_factor > 0:
        new_size = (int(large_image.width / scale_factor), int(large_image.height / scale_factor))
        large_image = large_image.resize(new_size, Image.LANCZOS)

    return large_image

folder_path = '/Dataset4/hmh_data/image_classification/Patch_Annotations/SigmoidColon/GTEX-12C56-1025/'
scale_factor = 2

large_image = create_large_image(folder_path, scale_factor)
large_image.save('GTEX-12C56-1025.jpg')

GTEX-1PWST-1925: [image]
GTEX-12C56-1025: [image]
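One thing that may be worth ruling out: get_coordinates above takes x and y by position (first and second field inside the brackets). If a tile name ever carried an extra leading field or a different order, the wrong numbers would be read silently. The sketch below looks the fields up by name instead. It assumes the names contain key=value pairs inside square brackets, as the parsing code implies; the example file names and the helper names parse_tile_fields and get_coordinates_by_key are hypothetical, not taken from the repository.

```python
import re

# Assumption: tile names carry integer key=value pairs inside square brackets,
# e.g. "tile [x=1024,y=2048,w=128,h=128].jpg" (hypothetical example name).
_FIELD = re.compile(r"(\w+)=(-?\d+)")

def parse_tile_fields(filename):
    """Return the integer key=value fields found inside the first [...] group."""
    match = re.search(r"\[([^\]]*)\]", filename)
    if match is None:
        raise ValueError(f"no [...] group in {filename!r}")
    return {key: int(value) for key, value in _FIELD.findall(match.group(1))}

def get_coordinates_by_key(filename):
    """Look up x and y by name, so field order and extra fields do not matter."""
    fields = parse_tile_fields(filename)
    return fields["x"], fields["y"]
```

If this returns the same pairs as get_coordinates for every file in the folder, the positional parsing is not the cause of the fragmented result.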

francescocister commented 2 weeks ago

Dear @Hanminghao,

Thank you! If I understand correctly, you are trying to plot the annotated tiles we provided. For the first sample (Esophagus Mucosa), the coordinates look correct, as you can see in the WSI from GTEx: [image]

We did not fully annotate the samples, as that was not necessary (we thought it was more useful to label the same regions across different samples to make it more robust). This is why you don't see the whole histology when you plot the tiles.

I hope this answers your question. Let me know if you run into any other problems!

Francesco
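To put a number on how partial the annotation is for a given sample, a minimal sketch like the following could be used. It assumes the tiles sit on a regular grid whose step equals the tile size (128 pixels in the question); grid_coverage is a hypothetical helper, not part of RNAPath.

```python
def grid_coverage(coords, tile_size=128):
    """Fraction of grid cells inside the tiles' bounding box that hold a tile.

    coords: iterable of (x, y) top-left pixel coordinates of the tiles.
    Assumption: tiles are aligned to a grid with step tile_size.
    """
    # Map each tile to its grid cell; duplicates collapse in the set.
    cells = {(x // tile_size, y // tile_size) for x, y in coords}
    if not cells:
        return 0.0
    cols = [col for col, _ in cells]
    rows = [row for _, row in cells]
    n_cols = max(cols) - min(cols) + 1
    n_rows = max(rows) - min(rows) + 1
    return len(cells) / (n_cols * n_rows)
```

A value well below 1.0 means most of the bounding box has no labelled tile, which matches a stitched image that looks fragmented even though every coordinate is correct. The coordinates can come from get_coordinates applied to each tile file name.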