create a qupath project from the model inference results

kaczmarj commented 1 year ago

the wsinfer run cli should be able to generate a qupath project directly from the slides and model outputs. this would make qupath integration tighter and make it easier for users to visualize results in qupath.

qupath projects are described here https://qupath.readthedocs.io/en/0.4/docs/tutorials/projects.html

the project information is stored in a file and the goal here would be to generate this file.

kaczmarj commented 1 year ago

@swaradgat19 - could you take a look into this? you will have to get a bit familiar with using qupath https://qupath.readthedocs.io

try to create a project using the qupath gui (described at https://qupath.readthedocs.io/en/0.4/docs/tutorials/projects.html). then look at that saved project file and try to reconstruct it with code. that will give us a good idea of how to create one of these files within wsinfer.

swaradgat19 commented 1 year ago

@kaczmarj I was going through how we can go about the integration. As I understand it, when we're running wsinfer run, a QuPath project (essentially a file with .qpproj extension) is created and all the results should be stored in that project. I found a Python library named 'Paquo', which helps us read and write QuPath projects from Python. I'm currently testing it.

swaradgat19 commented 1 year ago

@kaczmarj I'm running a script that makes a new project named NewProject (on my local machine) and I'm adding an svs file to it.

from paquo.projects import QuPathProject
from paquo.images import QuPathImageType

qp = QuPathProject("NewProject/", mode='a')

qp.add_image('SlideImages/CMU-1-JP2K-33005.svs', image_type=QuPathImageType.OTHER)

Getting this issue:

swaradgat@Swarads-MacBook-Air BMI % /usr/local/bin/python3 /Users/swaradgat/Desktop/University/BMI/qupath_script.py
Traceback (most recent call last):
  File "/Users/swaradgat/Desktop/University/BMI/qupath_script.py", line 1, in <module>
    from paquo.projects import QuPathProject
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paquo/projects.py", line 23, in <module>
    from paquo._logging import get_logger
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paquo/_logging.py", line 13, in <module>
    from paquo.java import ByteArrayOutputStream
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paquo/java.py", line 32, in <module>
    qupath_version = start_jvm(finder_kwargs=to_kwargs(settings))
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paquo/jpype_backend.py", line 298, in start_jvm
    _version = str(JClass("qupath.lib.common.GeneralTools").getVersion())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/jpype/_jclass.py", line 99, in __new__
    return _jpype._getClass(jc)
jpype._core.JVMNotRunning: Java Virtual Machine is not running

Not sure why I'm getting this issue. QuPath works fine on my local system.

kaczmarj commented 1 year ago

As I understand it, when we're running wsinfer run, a QuPath project (essentially a file with .qpproj extension) is created and all the results should be stored in that project

yes, precisely.

paquo looks like a great tool for this. qupath vendors the jvm so it's likely that paquo/jpype cannot find the jvm.

I was looking at https://jpype.readthedocs.io/en/latest/api.html#jpype.startJVM and it looks like one fix is setting the JAVA_HOME environment variable to the path that includes libjvm.so or jvm.dll. in the command-line example below, the JAVA_HOME points to the internal qupath directory that includes the jvm shared library.

JAVA_HOME=~/.local/opt/QuPath/lib/runtime/lib/server/ python -c 'import jpype; jpype.startJVM()'

let me know if setting JAVA_HOME solves that issue.
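To check whether the JVM shared library is actually where JAVA_HOME points, a small helper like the one below could search a QuPath installation for it. This is only a sketch: the function name find_jvm_library is hypothetical, and libjvm.dylib (the macOS file name) is an addition to the two file names mentioned above.

```python
from __future__ import annotations

from pathlib import Path

# File names of the JVM shared library. libjvm.so and jvm.dll are the names
# mentioned above; libjvm.dylib (macOS) is an added assumption.
JVM_LIBRARY_NAMES = ("libjvm.so", "jvm.dll", "libjvm.dylib")


def find_jvm_library(qupath_dir: Path | str) -> Path | None:
    """Return the first JVM shared library found under qupath_dir, or None."""
    root = Path(qupath_dir).expanduser()
    if not root.is_dir():
        return None
    # Walk the installation recursively and stop at the first match.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name in JVM_LIBRARY_NAMES:
            return path
    return None
```

If this returns a path, its parent directory is the value to try for JAVA_HOME. If it returns None, the QuPath directory itself is probably wrong.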

swaradgat19 commented 1 year ago

I tried setting it (ran the exact command on the terminal) but ran into the same error.

Paquo tries to search for the QuPath directory. We can customize where paquo searches for QuPath in the .paquo.toml file. I set qupath_dir in the toml file to the directory of the QuPath app:

# current paquo configuration
# ===========================
# format: TOML
qupath_dir = "/Applications/QuPath.app"
qupath_search_dirs = [
    "/opt",
    "/Applications",
    "c:/Program Files",
    "/usr/local",
    "~/Applications",
    "~/AppData/Local",
    "~",
]
qupath_search_dir_regex = "(?i)qupath.*"
qupath_search_conda = true
qupath_prefer_conda = true
java_opts = [
    "-XX:MaxRAMPercentage=50",
]
safe_truncate = true
jvm_path_override = ""
mock_backend = false
cli_force_log_level_error = true
warn_microsoft_store_python = true

Still the same issue. Changing the jvm_path_override might help. Working on it

swaradgat19 commented 1 year ago

I was trying this on macOS. I tried it on the remote server. In the /home/sggat/wsinfer/wsinfer directory, I made a directory of slide images (SlideImages/). Made a QuPath project and added that svs image to the project.

Error:

Traceback (most recent call last):
  File "/home/sggat/wsinfer/wsinfer/qupath_script.py", line 1, in <module>
    from paquo.projects import QuPathProject
  File "/home/sggat/anaconda3/lib/python3.11/site-packages/paquo/projects.py", line 23, in <module>
    from paquo._logging import get_logger
  File "/home/sggat/anaconda3/lib/python3.11/site-packages/paquo/_logging.py", line 13, in <module>
    from paquo.java import ByteArrayOutputStream
  File "/home/sggat/anaconda3/lib/python3.11/site-packages/paquo/java.py", line 32, in <module>
    qupath_version = start_jvm(finder_kwargs=to_kwargs(settings))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sggat/anaconda3/lib/python3.11/site-packages/paquo/jpype_backend.py", line 198, in start_jvm
    app_dir, runtime_dir, jvm_path, jvm_options = finder(**finder_kwargs)
                                                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sggat/anaconda3/lib/python3.11/site-packages/paquo/jpype_backend.py", line 111, in find_qupath
    raise ValueError("no valid qupath installation found")
ValueError: no valid qupath installation found

This is raised because QuPath hasn't been installed. Maybe someone with sudo access can install it so that we can test it.

kaczmarj commented 1 year ago

Good idea to try it on different systems. We don’t need sudo to install qupath. You can download and unpack the linux release from here https://github.com/qupath/qupath/releases/tag/v0.4.3

swaradgat19 commented 1 year ago

Thanks. I installed QuPath and ran the script and it made a new directory (NewProject). The folder contained the following files/folders:

classifiers data project.qpproj

I checked the thumbnail.jpg and it shows the image correctly. So it seems to be working.

swaradgat19 commented 1 year ago

@kaczmarj This script adds all the images inside a folder into the QuPath project. Where do we intend to add this to the project? Is it in the run_inference.py file?

import os
from paquo.projects import QuPathProject
from paquo.images import QuPathImageType

DIRECTORY = "SlideImages"
qp = QuPathProject("QuPathProject/", mode='a')

for filename in os.listdir(DIRECTORY):

    try:
        qp.add_image(f"{DIRECTORY}/{filename}", image_type=QuPathImageType.OTHER)
        print(f"{DIRECTORY}/{filename} added successfully!")
    except Exception as e:
        print(f"{DIRECTORY}/{filename} was not added (it may already exist): {e}")

kaczmarj commented 1 year ago

we should create a new namespace for this functionality. let's put it in a new file wsinfer/qupath.py.

we would want to add the images to the qupath project and we want to add the model predictions for each image too.

swaradgat19 commented 1 year ago

Sure. In the results folder that is generated, I'm assuming we want to save the masks

kaczmarj commented 1 year ago

we don't necessarily need the masks now. most important at this stage are the model outputs.

swaradgat19 commented 1 year ago

Okay. Just wanted to confirm what the model outputs are. Could you tell me which directory you're referring to?

kaczmarj commented 1 year ago

yes, after one runs wsinfer run, the model outputs are stored in OUTPUT_DIR/model-outputs. that directory contains one CSV file per whole slide image. those CSV files store the model outputs in columns with the prefix prob_ (those are softmax probabilities).

the coordinates in the CSVs (minx, miny, width, height) are in pixels at the base level resolution of the whole slide image.
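To make the CSV layout concrete, here is a minimal sketch that reads one model-output CSV with the standard library and separates the tile coordinates from the prob_ columns. The function name and the example class names (prob_notumor, prob_tumor) are made up for illustration. Only the minx, miny, width, height columns and the prob_ prefix come from the description above.

```python
from __future__ import annotations

import csv
import io
from typing import TextIO

COORD_COLUMNS = ("minx", "miny", "width", "height")


def read_model_outputs(f: TextIO) -> list[dict]:
    """Parse a model-output CSV into one dict per tile."""
    tiles = []
    for row in csv.DictReader(f):
        tiles.append(
            {
                # Coordinates are pixels at the base-level resolution.
                "box": tuple(int(float(row[c])) for c in COORD_COLUMNS),
                # Every column starting with prob_ is a softmax probability.
                "probs": {
                    k: float(v) for k, v in row.items() if k.startswith("prob_")
                },
            }
        )
    return tiles


# Hypothetical two-tile CSV, only to show the expected shape of the data.
example = io.StringIO(
    "minx,miny,width,height,prob_notumor,prob_tumor\n"
    "0,0,350,350,0.9,0.1\n"
    "350,0,350,350,0.2,0.8\n"
)
tiles = read_model_outputs(example)
print(tiles[1]["box"], tiles[1]["probs"]["prob_tumor"])
```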

swaradgat19 commented 1 year ago

Got it. So this csv will give us the heatmap probabilities. How should we save these model predictions in the QuPath project? Like should I save it as a heatmap probability image or a csv file itself?

kaczmarj commented 1 year ago

ah i see. I would save them as Tile objects (qupath defines the Tile object). see the geojson representation of what i mean below:

https://github.com/SBU-BMI/wsinfer/blob/d8f9982f682aae66cbf438392af432337e1001f2/wsinfer/cli/convert_csv_to_geojson.py#L31-L49

each row in a CSV would be a separate Tile object with associated probabilities.
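As a rough sketch of that mapping, one CSV row can be turned into a GeoJSON Feature whose geometry is the tile's bounding box. The property names used here (objectType and measurements) are assumptions for illustration, and the function name is hypothetical. The linked convert_csv_to_geojson.py remains the authoritative version.

```python
from __future__ import annotations


def row_to_tile_feature(row: dict) -> dict:
    """Build a GeoJSON Feature for one tile (property layout is assumed)."""
    minx, miny = int(float(row["minx"])), int(float(row["miny"]))
    maxx = minx + int(float(row["width"]))
    maxy = miny + int(float(row["height"]))
    # A GeoJSON polygon ring is closed: the first and last points coincide.
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    # Keep only the model outputs, i.e. the columns prefixed with prob_.
    measurements = {k: float(v) for k, v in row.items() if k.startswith("prob_")}
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        # Assumed layout; check the linked code for what QuPath expects.
        "properties": {"objectType": "tile", "measurements": measurements},
    }
```

A whole CSV would then become a FeatureCollection with one such Feature per row.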

swaradgat19 commented 1 year ago

@kaczmarj I've written a script that adds svs files to a QuPath Project. ( I'm accessing the minx, miny, etc values and printing them for my understanding. )

from __future__ import annotations
import os
from paquo.projects import QuPathProject
from paquo.images import QuPathImageType
import numpy as np

import json
from pathlib import Path

import click
import pandas as pd
import tqdm

DIRECTORY = "../wsinfer/wsinfer/SlideImages"
MODEL_OUTPUTS = "../wsinfer/wsinfer/Results/model-outputs"

qp = QuPathProject("QuPathProject/", mode='a')

# Adds images to the QuPath Project
for filename in os.listdir(DIRECTORY):

    try:
        qp.add_image(f"{DIRECTORY}/{filename}", image_type=QuPathImageType.OTHER)
        print(f"{DIRECTORY}/{filename} added successfully!")
    except Exception as e:
        print(f"{DIRECTORY}/{filename} was not added (it may already exist): {e}")

    filename_csv = Path(filename).stem + ".csv"

    df = pd.read_csv(f"{MODEL_OUTPUTS}/{filename_csv}")

    print(f"filename --> {filename_csv}, df -- >  \n{df} ")

    # print("entire row --> \n",df.iloc[0])
    # print("minx part of row -->", df.iloc[0]["minx"])

    row = df.iloc[0]
    minx, miny, width, height = row["minx"], row["miny"], row["width"], row["height"]
    print(f"minx = {minx}, miny = {miny}, width = {width}, height = {height}")

I just wanted to clarify. I've run the pipeline using the wsinfer run command ( on two .svs files I got from a link) and got the results in the Results folder. Now, I have to import the csv files ( model-outputs of both the images) and save them as tile objects, correct?

I was trying to get an intuition of what exactly the Tile object is on QuPath. I defined an ROI (using a circle) and QuPath automatically made tiles according to the shape. So using the csv files of each image, we have to save these tiles (along with the probabilities) in the svs file itself?

kaczmarj commented 1 year ago

sorry for missing this earlier @swaradgat19 .

So using the csv files of each image, we have to save these tiles (along with the probabilities) in the svs file itself?

yes exactly.

here's the workflow i use for visualizing the wsinfer outputs in qupath.

1. use wsinfer run to get model outputs (as you've done)
2. convert the model output CSVs to geojson using wsinfer togeojson
3. open one svs file in QuPath
4. drag and drop the geojson file for that svs file into the qupath window.
5. this should show tiles overlaid on the svs image. each of those tiles has a model output associated with it.
6. you can save that as a qupath project, and the project will include the tiles (along with their model outputs)

what we want to do in this issue is get to step 6 using the command line.

swaradgat19 commented 1 year ago

Got it. That cleared up a lot of doubts. I dragged the JSON file onto the appropriate image and it gave me this layout. Is this how it's supposed to look?

I'm currently implementing this using paquo api.

kaczmarj commented 1 year ago

yes, and if you zoom in, you will see more tiles.

press the button with solid green to fill in the tiles (fifth button from the left)

and in the top menu, click Measure -> Show measurement maps. this will open a window with the model outputs. click on one of the names, and a heatmap should appear on the slide image.

kaczmarj commented 1 year ago

qupath distinguishes between "annotations" and "detections". we will be adding the tiles as detections. and the qupath objects we need to create are PathTileObject.

i think the easiest for now would be to try paquo's load_geojson function

swaradgat19 commented 1 year ago

The authors have fixed it I believe. I'll try running this with the latest paquo version

kaczmarj commented 1 year ago

yes, the issue was fixed by https://github.com/Bayer-Group/paquo/pull/105 . if we use paquo, we should pin paquo>=0.7.1

swaradgat19 commented 1 year ago

Do we need to change the geojson format as specified in #178? Because I upgraded paquo to 0.7.1 and it was giving me the same error

kaczmarj commented 1 year ago

Yes you will. Could you submit a new pr only to make that geojson format change?

swaradgat19 commented 1 year ago

Sure

swaradgat19 commented 1 year ago

Just wanted to update. I loaded the new geojson files into the images using paquo and it gives no errors. We can only confirm with QuPath. Maybe you could try running it on a Linux machine. I do not have one :)


from pathlib import Path
import os
import geojson

from paquo.images import QuPathImageType
from paquo.projects import QuPathProject
from paquo.hierarchy import QuPathPathObjectHierarchy

IMAGE_DIRECTORY = "../wsinfer/wsinfer/SlideImages"
MODEL_OUTPUTS = "../wsinfer/wsinfer/Results/model-outputs"

# Adds images to the QuPath Project
with QuPathProject('QuPathProject/', mode="a") as qp:
    for filename in os.listdir(IMAGE_DIRECTORY):
        filename = f"{IMAGE_DIRECTORY}/{filename}"
        print(f"filename --> {filename}")
        try:
            qp.add_image(filename, image_type=QuPathImageType.OTHER)
            # filename already includes IMAGE_DIRECTORY, so print it as-is.
            print(f"{filename} added successfully!")
        except Exception as e:
            print(f"{filename} was not added (it may already exist): {e}")

###
# Generate the geojson files automatically here 
###

# Equivalent of dragging and dropping geojson file to the respective svs file
geoJSON_DIRECTORY = "../wsinfer/wsinfer/geojson-results/"
idx = 0
with QuPathProject('QuPathProject/', mode="a") as qp:
    for filename in os.listdir(f"{geoJSON_DIRECTORY}"):

        with open(f"{geoJSON_DIRECTORY}/{filename}") as f:
            detections = geojson.load(f)

        feature_list = detections["features"]
        qp.images[idx].hierarchy.load_geojson(feature_list)
        idx += 1

I'm yet to write the code where the geojson results are generated automatically after the wsinfer run command

kaczmarj commented 1 year ago

hi @swaradgat19 - your paquo script worked, the only thing i had to do was tell paquo where my qupath was installed with the environment variable PAQUO_QUPATH_DIR=~/opt/QuPath/ because i installed qupath into a relatively unconventional place.

i refactored your script and pasted it below. instead of using os.listdir twice to get the slide paths and geojson paths, i would be more comfortable constructing a list of slide+geojson pairs. my concern with using os.listdir twice is that we are relying on the order to be the same for the two directories and i don't think that can be guaranteed.

another change was using the returned object of qp.add_image to add the geojson to that specific image. this gives us a guarantee that we're adding the geojson to the correct image (instead of indexing the image using the order of os.listdir).

from __future__ import annotations

import json
from pathlib import Path

from paquo.projects import QuPathProject

QUPATH_PROJECT_DIRECTORY = "QuPathProject"

def add_image_and_geojson(
    qupath_proj: QuPathProject, *, image_path: Path | str, geojson_path: Path | str
) -> None:
    with open(geojson_path) as f:
        # FIXME: check that a 'features' key is present and raise a useful error if not
        geojson_features = json.load(f)["features"]

    entry = qupath_proj.add_image(image_path)
    # FIXME: test that the 'load_geojson' function exists. If not, raise a useful error
    entry.hierarchy.load_geojson(geojson_features)  # type: ignore

# Store a list of matched slides and geojson files. Linking the slides and geojson in
# this way prevents a potential mismatch by simply listing directories and relying on
# the order to be the same.
slides_and_geojsons = [
    ("slideA.svs", "slideA.json"),
    ("slideB.svs", "slideB.json"),
    ("slideC.svs", "slideC.json"),
]

with QuPathProject(QUPATH_PROJECT_DIRECTORY, mode="w") as qp:
    for image_path, geojson_path in slides_and_geojsons:
        try:
            add_image_and_geojson(qp, image_path=image_path, geojson_path=geojson_path)
        except Exception as e:
            print(f"Failed to add image/geojson with error:: {e}")

regarding how to incorporate this into wsinfer run, one option is to add a command line flag --qupath to create a qupath project after all the processing is done.

i am also thinking that because geojson is such a useful format, most people might want to have it with their results. we should probably also add a --geojson flag to wsinfer run that will automatically run the geojson conversion at the end of model inference.

to handle this, the togeojson cli will have to be refactored. the logic of the actual conversion should be put in a new file (maybe wsinfer/geojson.py), and then those methods could be used when creating the qupath project.

swaradgat19 commented 1 year ago

@kaczmarj My bad. I forgot to mention, we need to make changes to the .paquo.toml file so that paquo finds the location of QuPath. It can be generated using python -m paquo config --list --default. It usually searches but if it's in an unconventional location, we have to specify the location. We can automate that as well

swaradgat19 commented 1 year ago

Thanks for the changes! I was also thinking the same. We would need geojson files if the user wants to create a qupath project. Maybe we can add a command line flag --qupath and generate the geojson whenever they mention this flag, instead of keeping it (togeojson) as a separate command.

kaczmarj commented 1 year ago

Maybe we can add a command line flag --qupath and generate the geojson whenever they mention this flag, instead of keeping it (togeojson) as a separate command.

yes i agree!

we need to make changes to the .paquo.toml file so that paquo finds the location of QuPath

for now, let's plan to include some documentation of how one can specify the qupath directory if it cannot be found. we can also raise a useful error if paquo complains that it cannot find qupath. that error can explain how to set the qupath location.

swaradgat19 commented 1 year ago

@kaczmarj Paquo does search in all these directories:

# default search paths for qupath
qupath_search_dirs = [
  "/opt",  # linux
  "/Applications",  # macos
  "c:/Program Files",  # win
  "/usr/local",  # linux
  "~/Applications",  # macos
  "~/AppData/Local",  # win
  "~",
]

In most cases, it should get the directory (across all kinds of OS). But making the user enter the exact path might be a tedious option. How should we go about it? I could start working on this

kaczmarj commented 1 year ago

nice find. these paths cover basically all of the typical places qupath would be installed. i think we should simply document that qupath will be found automatically, and if that fails, one can use an environment variable to tell paquo where qupath lives.

kaczmarj commented 1 year ago

to move forward with this issue, i would create a new file wsinfer/qupath.py and implement the qupath project creation. you can also pull the geojson conversion into its own module. you can also modify wsinfer run to add a --qupath option and create the geojson files automatically.

i am considering creating geojson files at the end of wsinfer run all the time.... what do you think? geojson is a very useful / common format, and it might make sense to have it available by default.

swaradgat19 commented 1 year ago

Yes, creating geojson after wsinfer run would be a great idea, although we'll have to specify the geojson directory every time. Or we could just store it in the results folder itself.

swaradgat19 commented 1 year ago

Additionally, there can be a --qupath option where the user can create a QuPath project and store the images there. Since we'll already have the geojson files, we can easily create the project, along with the probabilities loaded into each image.

kaczmarj commented 1 year ago

we can store the geojsons in RESULTS_DIR/model-outputs-geojson.

so the results directory would have masks, patches, model-outputs, and model-outputs-geojson.

although it might make sense to rename model-outputs to model-outputs-csv to clarify that the model outputs exist in two formats but are otherwise the same

swaradgat19 commented 11 months ago

@kaczmarj Should I work on this issue for now, until the docker and Ubuntu tests are resolved?

kaczmarj commented 11 months ago

@swaradgat19 - yes please! i will try to resolve those tests today.

swaradgat19 commented 11 months ago

@kaczmarj One issue with creating a qupath project is that the user has to manually make a toml file and enter the directory in which QuPath is present.

I was thinking of automating the process of creating a toml file in the current working directory (Path.cwd()). While doing this, we would have to ask the user for the QuPath directory (paquo searches for QuPath in some default paths, but it isn't able to find QuPath on my system).

Do you think we should take the QuPath directory as input from the user?

swaradgat19 commented 11 months ago

Paquo searches in the following directories:

qupath_search_dirs = [
    "/opt",
    "/Applications",
    "c:/Program Files",
    "/usr/local",
    "~/Applications",
    "~/AppData/Local",
    "~",
]

While this might work on most systems, it isn't working on mine, since my QuPath is installed in /home/sggat/QuPath

kaczmarj commented 11 months ago

While this might work on most systems, it isn't working on mine, since my QuPath is installed in /home/sggat/QuPath

are you using linux? the majority of qupath users use windows or mac, so the default search dirs should find qupath for most users.

i also tend to use linux so i'd have to tell the script where qupath is installed. instead of modifying the toml file, one can also use PAQUO_QUPATH_DIR which might be easier.

in any case, we should document these things on our website. we can inform users that if they run into an error that qupath can't be found, they can provide the qupath location with PAQUO_QUPATH_DIR or by modifying the toml file.

Do you think we should take the QuPath directory as input from the user?

no, i don't think so. if there is an error that qupath can't be found, we should print a clear error message with instructions of how the user can fix that. the easiest fix imho is to use the environment variable PAQUO_QUPATH_DIR.

swaradgat19 commented 11 months ago

@kaczmarj Yes I'm using Linux (Harrier). Yes, setting PAQUO_QUPATH_DIR should be easier. Sure, I'll put an error message if the path isn't found

swaradgat19 commented 10 months ago

Was going through paquo documentation when I got this piece of code:

with QuPathProject('./my_qupath_project', mode='r+') as qp:
    qp.update_image_paths(uri2uri={"file:/somewhere_else/image_1.svs": "file:/share/image_1.svs"})
    assert all(qp.is_readable().values())

Link to docs

We can add an additional feature where the QuPath projects can be transferred around in different directories. Just wanted to note it down here

kaczmarj commented 10 months ago

Perfect! That will be very useful

swaradgat19 commented 9 months ago

Wrote a script to take the qupath installation path from the user if not found. Also I'm sorting the csvs and jsons in such a way that file_1.csv maps to file_1.json. It is working on the local system. Do have a look

from __future__ import annotations

import sys
import json
from pathlib import Path
import subprocess
import os
import paquo
from natsort import natsorted #add in dependencies?

def configure_qupath():

    try:
        from paquo.projects import QuPathProject
    except Exception as e:
        print(f"Couldn't find Qupath project with error: {e}")

        choice = input("""QuPath can be configured by setting the 'qupath_dir' field in '.paquo.toml'.
        You can also manually enter the path to your local QuPath installation.
        Do you want to enter manually? (Y[yes] or n[no]): 
        """)

        if choice is None or choice != 'Y':
            pass
        elif choice == 'Y':
            ### Converting the string to Path doesnt work. Gives TypeError: str expected, not PosixPath
            qupath_directory = input("Please enter the exact path where QuPath is installed: ")

            if Path(qupath_directory).exists():
                os.environ["PAQUO_QUPATH_DIR"] = str(qupath_directory) # setting the env var 
                paquo.settings.reload() # Reloading 
            else:
                print(f"QuPath Directory not found. Try again!")
                sys.exit(1)

def add_image_and_geojson(
    qupath_proj: QuPathProject, *, image_path: Path | str, geojson_path: Path | str
) -> None:
    with open(geojson_path) as f:
        # FIXME: check that a 'features' key is present and raise a useful error if not
        geojson_features = json.load(f)["features"]

    entry = qupath_proj.add_image(image_path)
    # FIXME: test that the 'load_geojson' function exists. If not, raise a useful error
    entry.hierarchy.load_geojson(geojson_features)  # type: ignore

# Store a list of matched slides and geojson files. Linking the slides and geojson in
# this way prevents a potential mismatch by simply listing directories and relying on
# the order to be the same.

def make_qupath_project(wsi_dir, results_dir):

    configure_qupath() # Sets the environment variable "PAQUO_QUPATH_DIR"
    try:
        from paquo.projects import QuPathProject
    except: 
        print("Unable to find Qupath! Run the program again")
        sys.exit(1)

    print("Found QuPath successfully!")
    QUPATH_PROJECT_DIRECTORY = "QuPathProject"

    csv_list = natsorted([str(file) for file in wsi_dir.iterdir() if file.is_file()])
    json_list = natsorted([str(file) for file in Path(f"{results_dir}/model-outputs-geojson").iterdir() if file.is_file()])

    slides_and_geojsons = [
        (csv, json) for csv, json in zip(csv_list, json_list)
    ]
    with QuPathProject(QUPATH_PROJECT_DIRECTORY, mode="w") as qp:
        for image_path, geojson_path in slides_and_geojsons:
            try:
                add_image_and_geojson(qp, image_path=image_path, geojson_path=geojson_path)
            except Exception as e:
                print(f"Failed to add image/geojson with error:: {e}")
    print("Successfully created QuPath Project!")

swaradgat19 commented 9 months ago

@kaczmarj Also wanted to ask. Should I add a separate test in test_all.py for this feature?

kaczmarj commented 8 months ago

instead of sorting the csv and json files to find matching pairs, i would get all of the csv files, and then build a list of json files using the stems of the csv files.
take care of those FIXME comments in add_image_and_geojson
i would suggest we do not try to configure the qupath location ourselves. instead, we should document how one sets the qupath directory. in many cases, Qupath should be found automatically, and we can tell users to use PAQUO_QUPATH_DIR if not. otherwise, things could get muddy and complicated.
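For the two FIXME comments, something along these lines could work. It is a sketch only: the helper names are hypothetical and the error wording is just a suggestion. The first helper validates the 'features' key, the second checks for load_geojson with hasattr before calling it.

```python
from __future__ import annotations

import json
from pathlib import Path


def read_geojson_features(geojson_path: Path | str) -> list:
    """Return the 'features' list of a GeoJSON file or raise a clear error."""
    with open(geojson_path) as f:
        data = json.load(f)
    if not isinstance(data, dict) or "features" not in data:
        raise ValueError(f"No 'features' key found in GeoJSON file: {geojson_path}")
    return data["features"]


def load_features_into_hierarchy(hierarchy, features: list) -> None:
    """Call hierarchy.load_geojson if it exists, else raise a clear error."""
    if not hasattr(hierarchy, "load_geojson"):
        # The version hint follows the pin suggested earlier in this thread.
        raise RuntimeError(
            "The installed paquo has no 'load_geojson' method."
            " Please upgrade paquo (this thread suggests paquo>=0.7.1)."
        )
    hierarchy.load_geojson(features)
```

add_image_and_geojson would then call read_geojson_features(geojson_path) and pass the result, together with entry.hierarchy, to load_features_into_hierarchy.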

swaradgat19 commented 8 months ago

About the sorting, I am using natsort to sort the filenames in the "natural alphabetic order" which ensures the files are sorted in the following fashion: file1, file2, ...file-n. You can find an example I referred to (here). It is sorting the files correctly, so I thought of using it.

Should I anyways do the pairing using the stems?

i would suggest we do not try to configure the qupath location ourselves. instead, we should document how one sets the qupath directory

Sure. I will change the code accordingly. I will try to find QuPath by default and if it isn't found, should I just throw an error and write the appropriate message on how to set the environment variable/toml?

kaczmarj commented 8 months ago

Should I anyways do the pairing using the stems?

yes. think about the case where, for whatever reason, one directory is missing a file that is present in the other directory. if you sort the filenames and match them by sorted position, you will have a frameshift.
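A minimal sketch of stem-based pairing, applied here to slides and their geojson files. The function name is hypothetical, and it assumes each geojson file is named after the slide stem with a .json suffix (as in the slideA.svs / slideA.json example above). Slides without a match are reported instead of silently shifting the pairing.

```python
from __future__ import annotations

from pathlib import Path


def pair_slides_with_geojsons(
    slide_paths: list[Path], geojson_dir: Path
) -> tuple[list[tuple[Path, Path]], list[Path]]:
    """Pair each slide with <geojson_dir>/<slide stem>.json.

    Returns the matched pairs and the slides that have no GeoJSON file.
    """
    pairs, missing = [], []
    for slide in slide_paths:
        # Build the expected name from the slide itself, not from list order.
        geojson = geojson_dir / f"{slide.stem}.json"
        if geojson.exists():
            pairs.append((slide, geojson))
        else:
            missing.append(slide)
    return pairs, missing
```

With this approach, a missing slideB.json only drops slideB. slideC is still paired with slideC.json, whereas zipping two sorted lists would pair slideB with slideC.json.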

I will try to find QuPath by default

let paquo do the work of finding qupath. and if an error is thrown that qupath cannot be found, then yes, throw an error with a message. something like

QuPath is required to use this functionality but it cannot be found. If QuPath is installed, please define the environment variable PAQUO_QUPATH_DIR with the location of the QuPath installation. If QuPath is not installed, please install it from https://qupath.github.io/.
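One way this could look in code is sketched below. The function name is hypothetical. It relies on the behaviour seen in the Linux traceback earlier in this thread, where importing paquo raised ValueError("no valid qupath installation found"). Other failure modes, such as the JVMNotRunning error seen on macOS, are not handled here.

```python
QUPATH_NOT_FOUND_MESSAGE = (
    "QuPath is required to use this functionality but it cannot be found."
    " If QuPath is installed, please define the environment variable"
    " PAQUO_QUPATH_DIR with the location of the QuPath installation."
    " If QuPath is not installed, please install it from"
    " https://qupath.github.io/."
)


def import_qupath_project():
    """Import and return paquo's QuPathProject, with clearer error messages."""
    try:
        # paquo looks for QuPath at import time, so the import itself can fail.
        from paquo.projects import QuPathProject
    except ModuleNotFoundError as e:
        raise RuntimeError(
            "paquo (or one of its dependencies) could not be imported."
            " Please install paquo to create QuPath projects."
        ) from e
    except ValueError as e:
        # Raised by paquo when no QuPath installation is found.
        raise RuntimeError(QUPATH_NOT_FOUND_MESSAGE) from e
    return QuPathProject
```

Keeping the import inside a function means wsinfer itself can still be imported when paquo or QuPath is missing, and the error only appears when the QuPath feature is requested.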

SBU-BMI / wsinfer

create a qupath project from the model inference results #162