lely475 / ocelot23algo

Approach that won 3rd place in the OCELOT 2023 Challenge. Multi-organ H&E-based deep learning model for cell detection, applicable for tumor cellularity/ purity/ content estimation.
https://arxiv.org/abs/2312.12151

Data requirements #2

Closed FengZhongLiuDong closed 4 months ago

FengZhongLiuDong commented 5 months ago

Does SoftCTM have specific requirements for WSIs when running? I get:

    onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: input for the following indices
    index: 0 Got: 8 Expected: 1
    index: 2 Got: 640 Expected: 1024
    index: 3 Got: 640 Expected: 1024
    Please fix either the inputs/outputs or the model.
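For context, this ONNX Runtime error means the exported graph declares fixed input dimensions (batch size 1, 1024x1024 tiles) while a batch of 8 tiles of 640x640 was fed in. The shape check it performs can be mimicked like this (the helper name is illustrative, not part of this repo or of ONNX Runtime):

```python
def check_input_dims(expected, got):
    """Compare a tensor shape against a model's declared input shape,
    collecting (index, got, expected) for every fixed axis that differs.
    Dynamic axes (None) accept any size."""
    return [(i, g, e)
            for i, (e, g) in enumerate(zip(expected, got))
            if e is not None and e != g]

# Shapes from the reported error: the model declares NCHW (1, 3, 1024, 1024),
# while the pipeline fed a batch of 8 tiles of 640x640.
for index, got, expected in check_input_dims((1, 3, 1024, 1024), (8, 3, 640, 640)):
    print(f"index: {index} Got: {got} Expected: {expected}")
```

This reproduces exactly the three mismatched indices (0, 2 and 3) listed in the traceback.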

FengZhongLiuDong commented 5 months ago

such as the magnification level of WSI

lely475 commented 5 months ago

Hi,

So this is actually an error due to me not having updated the ONNX models yet, sorry about that. -.- I have just replaced them in the OneDrive link, so you will need to re-download the tissue segmentation and cell detection models; here's the link again. Let me know if you encounter any further issues!

And regarding the magnification: I currently provide two cell detection models, one suited for 20x and one for 50x images. Set the mode in the parameters accordingly, and the pipeline will automatically detect your WSI resolution and resize the images to match.
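For illustration, the automatic resizing described above amounts to scaling the slide by the ratio of the model's magnification to the slide's native magnification. A minimal sketch with hypothetical helper names (not the repository's actual API), using a 40x-to-20x example:

```python
def resize_factor(native_mag: float, model_mag: float) -> float:
    """Scale factor that brings a slide from its native magnification to
    the magnification the detection model expects (illustrative helper)."""
    return model_mag / native_mag

def resized_dims(width, height, native_mag, model_mag):
    """New pixel dimensions after rescaling to the model's magnification."""
    f = resize_factor(native_mag, model_mag)
    return round(width * f), round(height * f)

# A 40x slide processed with a 20x model is downscaled by a factor of 0.5:
print(resized_dims(100_000, 80_000, native_mag=40.0, model_mag=20.0))  # (50000, 40000)
```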

FengZhongLiuDong commented 4 months ago

Thank you for your help; the previous issue has been resolved. However, I encountered a new problem when running the program with the newly downloaded models: during prediction on whole slide images (WSIs), an error occurs due to mismatched array dimensions, which makes the broadcast fail. Interestingly, when testing different WSIs one by one, some run successfully while others report the same error. I suspect it may be related to the size of the WSIs, and I am currently investigating the root cause. Once again, I appreciate your contributions to biomedical research and your attentive guidance.

    mask[:, y : y + tile_size, x : x + tile_size] += pred
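For anyone hitting the same trace: the broadcast fails because the slice is clipped at the mask boundary, so the window is one pixel smaller than `pred`. A small NumPy demonstration with made-up sizes:

```python
import numpy as np

# Illustrative sizes: the rounded y puts the tile 1 pixel past the mask edge.
tile_size = 512
mask = np.zeros((1, 1023, 1024))   # height 1023
pred = np.ones((1, tile_size, tile_size))

y, x = 512, 0                      # y + tile_size = 1024 > 1023
window = mask[:, y : y + tile_size, x : x + tile_size]
print(window.shape)                # (1, 511, 512): one row short of pred
try:
    mask[:, y : y + tile_size, x : x + tile_size] += pred
except ValueError as err:
    print("broadcast failed:", err)
```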

lely475 commented 4 months ago

You're very welcome. Hmm, so we see a size mismatch of one pixel; I would assume the cause is the rounding of x and y, leading (in your case) to y + tile_size > image_height by exactly 1 pixel.

You could try adding the following, which checks whether this is the case (height or width exceeded by exactly 1 pixel) and subtracts 1 pixel for a correct tile size:

                _, height, width = mask.shape
                for pred, x, y in zip(pred_sgm, x_coords, y_coords):
                    # Rounding may overshoot the mask bounds by at most 1 pixel
                    assert round(x * wsi_info.f) - (width - tile_size - 1) <= 1
                    assert round(y * wsi_info.f) - (height - tile_size - 1) <= 1
                    # Clamp so the tile always fits inside the mask
                    x = min(round(x * wsi_info.f), width - tile_size - 1)
                    y = min(round(y * wsi_info.f), height - tile_size - 1)
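To convince yourself the clamp works, here is a small self-contained check with made-up mask and tile sizes (not the repository's code):

```python
import numpy as np

tile_size = 512
mask = np.zeros((1, 1023, 1024))   # illustrative mask, height 1023
pred = np.ones((1, tile_size, tile_size))
_, height, width = mask.shape

# Unclamped y overruns the mask height by 1 pixel (512 + 512 = 1024 > 1023):
y, x = 512, 0
# The suggested clamp pulls the tile back inside the mask:
x = min(x, width - tile_size - 1)
y = min(y, height - tile_size - 1)
mask[:, y : y + tile_size, x : x + tile_size] += pred  # broadcasts cleanly now
print(y, mask[:, y : y + tile_size, x : x + tile_size].shape)  # 510 (1, 512, 512)
```

The tile is shifted up by at most 2 pixels, which is negligible at WSI scale.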

In case this solves your problem, please let me know, and I will adjust the code accordingly.

And I expect the same adaptation would be required for the lines that follow:

            # Detect cells in each tile
            for x_coords, y_coords in tqdm(
                zip(b_x, b_y), desc="Find cells", total=len(b_x)
            ):
                for x, y in zip(x_coords, y_coords):
                    # Same 1-pixel tolerance and clamping as above
                    assert round(x * wsi_info.f) - (width - tile_size - 1) <= 1
                    assert round(y * wsi_info.f) - (height - tile_size - 1) <= 1
                    x = min(round(x * wsi_info.f), width - tile_size - 1)
                    y = min(round(y * wsi_info.f), height - tile_size - 1)
                    pred = mask[:, y : y + tile_size, x : x + tile_size]

FengZhongLiuDong commented 4 months ago

I sincerely appreciate the valuable assistance you have provided; your outstanding work is truly commendable. Thank you for the solution, which successfully resolved my issue. As you anticipated, it was caused by pixel rounding. At runtime I replaced the original line

    x, y = round(x * wsi_info.f), round(y * wsi_info.f)

with:

    x = min(round(x * wsi_info.f), width - tile_size - 1)
    y = min(round(y * wsi_info.f), height - tile_size - 1)

However, adjustments may also need to be made to the subsequent dataset_level_csv method.

lely475 commented 4 months ago

Great to hear this solved your problem! Did you make progress on the other problem you encountered?

Here are my thoughts, in case you haven't been able to solve it yet: the dataset_level_csv method should not be affected by this change. What I see from your error is a mismatch in the lengths of the wsis, tc and bc lists. The tc and bc lists are created simultaneously, so the mismatch must be between the wsis list and the tc/bc lists. The wsis list contains the names of all files in data_path, all of which are expected to be WSIs. I would investigate in the following directions:

  1. The existing detected_cells.csv file. If you have already run the analysis on a subset of slides, the continue_run method checks which WSIs were already processed and only runs the SoftCTM algorithm on the missing ones. Maybe the mismatch comes from a faulty or corrupted detected_cells.csv file?
  2. The data_path directory containing non-WSI files. This is unlikely, though, as loading the tiles with OpenSlide would already have raised an error earlier.
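For reference, the continue_run filtering described in point 1 can be sketched as follows; the file extensions, CSV column name and function name here are assumptions, not the repository's actual API:

```python
import csv
import os

def pending_wsis(data_path, csv_path):
    """Sketch of the continue_run idea: list the WSI files in data_path
    and drop those already recorded in detected_cells.csv."""
    wsi_exts = {".svs", ".tif", ".tiff", ".ndpi", ".mrxs"}
    wsis = sorted(
        f for f in os.listdir(data_path)
        if os.path.splitext(f)[1].lower() in wsi_exts
    )
    done = set()
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as fh:
            done = {row["wsi"] for row in csv.DictReader(fh)}
    # A corrupted CSV (or stray non-WSI files) surfaces as a length mismatch here.
    return [w for w in wsis if w not in done]
```

Comparing len(wsis) against the number of CSV rows before resuming would surface a corrupted file early.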

If you haven't run the analysis on many slides yet, I would otherwise recommend deleting the current outputs folder and starting from a clean setup.