jamesdolezal / slideflow

Deep learning library for digital pathology, with both Tensorflow and PyTorch support.
https://slideflow.dev
GNU General Public License v3.0

[BUG] File path cannot be properly resolved on Windows #334

Open Mr-Milk opened 7 months ago

Mr-Milk commented 7 months ago

Description

When I create a Dataset on Windows, it cannot retrieve existing tfrecords.

To Reproduce

Create a dataset as follows, with tfrecords present in the folder. Slideflow returns an empty list.

import pandas as pd
import slideflow as sf

annos = pd.read_csv(r"D:\projects\slideflow_project\annos.csv")
dataset = sf.Dataset(
    tfrecords=r"D:\projects\slideflow_project\tfrecords",
    slides=r"D:\data\svs",
    annotations=annos,
    tile_px=512,
    tile_um=256,
)

print(dataset.tfrecords())

Expected behavior

Slideflow should recognize and pick up all tfrecords.

Environment:

Potential cause

I think this is related to how paths are handled in slideflow.

The following code returns D:\\data\\data instead of data:

from slideflow.util import path_to_name

path_to_name(r"D:\data\data.csv")

But if I write the Windows path with Linux-style forward slashes, it works.
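
The behavior described above is consistent with a helper that splits only on "/", although that is an assumption and not something confirmed in this thread. The sketch below uses a hypothetical stand-in, naive_path_to_name (not slideflow's actual code), to reproduce the symptom, and compares it with pathlib's PureWindowsPath, which accepts both separators and runs on any OS.

```python
from pathlib import PureWindowsPath


def naive_path_to_name(path: str) -> str:
    """Hypothetical stand-in: split on '/' only, then drop the extension."""
    filename = path.split("/")[-1]
    return filename.rsplit(".", 1)[0] if "." in filename else filename


def windows_path_to_name(path: str) -> str:
    """Parse with Windows rules (backslash and forward slash both separate)."""
    return PureWindowsPath(path).stem


backslash_path = r"D:\data\data.csv"
forward_path = "D:/data/data.csv"

# The backslash path contains no '/', so the naive split keeps the directories.
print(repr(naive_path_to_name(backslash_path)))
print(naive_path_to_name(forward_path))

# PureWindowsPath gives the same file stem for both spellings.
print(windows_path_to_name(backslash_path))
print(windows_path_to_name(forward_path))
```

If the real helper behaves like the naive version, that would explain why only the forward-slash form of the path works.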

I have a question: why not use the standard library's pathlib to handle paths instead of os?

jamesdolezal commented 7 months ago

Thanks for this bug report. Slideflow is not officially supported or tested on Windows (see docs), but I think it would be fairly straightforward to resolve this issue. We try to be OS-agnostic where possible, but we don't have a Windows testing pipeline set up.

Switching to pathlib in the backend could be one solution for resolving Windows path issues. We'll do some digging and see what would be the best long-term solution for this.
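
As a rough illustration of what a pathlib-based approach could look like for tfrecord discovery, here is a minimal sketch. The helper name find_tfrecords and the ".tfrecords" extension are assumptions made for the example; this is not slideflow's implementation.

```python
import tempfile
from pathlib import Path


def find_tfrecords(folder):
    """Hypothetical helper: map slide name -> tfrecord path for one folder."""
    root = Path(folder)
    # Path.glob and Path.stem use the separator rules of the running OS,
    # so no manual splitting on '/' or '\\' is needed.
    return {p.stem: p for p in sorted(root.glob("*.tfrecords"))}


# Demo in a temporary folder with two fake tfrecords and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("slide_a.tfrecords", "slide_b.tfrecords", "notes.txt"):
        (Path(tmp) / name).touch()
    found = find_tfrecords(tmp)
    print(sorted(found))
```

Because the paths are never parsed as raw strings, the same code should behave identically for D:\projects\slideflow_project\tfrecords on Windows and for a POSIX path on Linux.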