Open kaczmarj opened 1 year ago
I think I see it. We have a grep call at line 76 in commandLineAlign.R that only keeps files that start with "prediction" in an effort to drop "color-" files. That returns no entries so everything "matches" because everything is nothing.
WSInfer outputs never have the prediction/color prefix, so I'll add a check: if grep("^prediction") returns no matches, skip that trim. Sound good?
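To make the failure concrete, here is a minimal sketch of what the prefix filter does to a WSInfer-style file list. The file names are made up for illustration; only the grep("^prediction", ...) call comes from the script, and the final all() check is an assumption about what "everything matches" refers to.

```r
# Hypothetical WSInfer-style file names: no "prediction" prefix.
tils <- c("TCGA-3L-AA1B-01Z-00-DX1.csv", "TCGA-4N-A93T-01Z-00-DX1.csv")

# grep() returns the indices of matching elements; with no match it is integer(0).
idx <- grep("^prediction", tils)
print(length(idx))

# Subsetting by integer(0) keeps nothing, so the file list is silently emptied.
filtered <- tils[idx]
print(length(filtered))

# Assumption: a later all()-style comparison runs on the empty vector.
# all() of an empty logical vector is TRUE, so the check passes vacuously.
vacuous <- all(filtered %in% character(0))
print(vacuous)
```

No error is raised at the filtering step, which is why the problem only surfaces later.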
Changes made for TIL and Canc sections, shown below for TIL only.
Old Code:
tils = tils[grep("^prediction", tils)]
writeLines(" . . . Dropping low_res and color- files . . . ")
if(any(grepl("low_res", tils))){
  tils = tils[-grep("low_res", tils)]
}
New Code:
if(length(grep("^prediction", tils))>0){ ## WSInfer outputs lack prefix, older outputs have prefix.
  tils = tils[grep("^prediction", tils)]
}
writeLines(" . . . Dropping low_res and color- files . . . ")
if(any(grepl("low_res", tils))){
  tils = tils[-grep("low_res", tils)]
}
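Since the same change applies to both the TIL and Canc sections, one option would be to factor it into a shared helper. This is only a sketch: dropAuxFiles is a hypothetical name, not something in commandLineAlign.R, and the example file names are invented. It uses grepl with logical indexing so no negative indexing is needed.

```r
# Hypothetical helper, not part of commandLineAlign.R.
dropAuxFiles <- function(files) {
  has_prefix <- grepl("^prediction", files)
  # Only trim by prefix when at least one file carries it (older outputs).
  # WSInfer outputs lack the prefix and are left untouched by this step.
  if (any(has_prefix)) {
    files <- files[has_prefix]
  }
  # Logical indexing is safe even when nothing matches "low_res".
  files[!grepl("low_res", files)]
}

# Invented file names, for illustration only.
legacy <- c("prediction-slideA", "color-slideA", "prediction-slideA.low_res")
wsinfer <- c("slideA.csv", "slideB.csv")
print(dropAuxFiles(legacy))
print(dropAuxFiles(wsinfer))
```

Calling the helper once for the TIL list and once for the cancer list would keep the two sections from drifting apart.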
is there a different path in the code to deal with wsinfer outputs? we would want to take that path if we detect that the files are from wsinfer. there are at least a few things we can test:
there should also be a message printed saying that it has found wsinfer outputs and will use those.
my only worry about assuming that we have wsinfer outputs if there are no files with prediction- prefixes is that if there are no files at all (or maybe the user passed a nested directory), then the error will be confusing.
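A rough sketch of how those concerns could be handled together: fail early with a clear error when there are no files at all, and print a message when WSInfer outputs are detected. Everything here is hypothetical (the function name, the messages, the return values), and treating any non-prefixed .csv as a WSInfer output is an assumption, not the pipeline's actual rule.

```r
# Hypothetical helper; name, messages, and return values are illustrative.
detectPredictionSource <- function(files) {
  # An empty listing gets its own error so it is not mistaken for WSInfer output.
  if (length(files) == 0) {
    stop("No prediction files found. Check the input path (is it a nested directory?).")
  }
  if (any(grepl("^prediction", files))) {
    return("legacy")
  }
  # Assumption: WSInfer model outputs are CSV files without the prefix.
  if (any(grepl("\\.csv$", files))) {
    message("Found WSInfer outputs; using those.")
    return("wsinfer")
  }
  stop("Files found, but none look like prediction- or WSInfer outputs.")
}

print(detectPredictionSource(c("slideA.csv", "slideB.csv")))
```

With this ordering, an empty or mis-specified directory stops with its own message instead of falling through to the WSInfer branch.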
Yes that'll just require a little shuffling but should be just as straightforward. Currently WSInfer detection is managed after parsing (and really is just a csv suffix check). See lymphFormatCsv object for that detection
Question @kaczmarj, does WSInfer spit any log files into the output directory? Something we would have to drop on a glob before running? I don't think so, but wanted to make sure.
Yes, it creates several directories: model-outputs, stitches, patches, and a json file with runtime info.
Best,
Jakub
here is a tree of wsinfer outputs. keep in mind that run_metadata_20230225T122426.json includes a timestamp so the actual name will differ across runs.
results-wsinfer
├── masks
│ ├── TCGA-3L-AA1B-01Z-00-DX1.jpg
│ ├── TCGA-4N-A93T-01Z-00-DX1.jpg
│ ├── TCGA-4T-AA8H-01Z-00-DX1.jpg
│ ├── TCGA-5M-AAT4-01Z-00-DX1.jpg
│ ├── TCGA-5M-AAT5-01Z-00-DX1.jpg
│ ├── TCGA-5M-AAT6-01Z-00-DX1.jpg
│ ├── TCGA-5M-AATE-01Z-00-DX1.jpg
│ ├── TCGA-A6-2671-01Z-00-DX1.jpg
│ ├── TCGA-A6-2672-01Z-00-DX1.jpg
│ └── TCGA-A6-2674-01Z-00-DX1.jpg
├── model-outputs
│ ├── TCGA-3L-AA1B-01Z-00-DX1.csv
│ ├── TCGA-4N-A93T-01Z-00-DX1.csv
│ ├── TCGA-4T-AA8H-01Z-00-DX1.csv
│ ├── TCGA-5M-AAT4-01Z-00-DX1.csv
│ ├── TCGA-5M-AAT5-01Z-00-DX1.csv
│ ├── TCGA-5M-AAT6-01Z-00-DX1.csv
│ ├── TCGA-5M-AATE-01Z-00-DX1.csv
│ ├── TCGA-A6-2671-01Z-00-DX1.csv
│ ├── TCGA-A6-2672-01Z-00-DX1.csv
│ └── TCGA-A6-2674-01Z-00-DX1.csv
├── patches
│ ├── TCGA-3L-AA1B-01Z-00-DX1.h5
│ ├── TCGA-4N-A93T-01Z-00-DX1.h5
│ ├── TCGA-4T-AA8H-01Z-00-DX1.h5
│ ├── TCGA-5M-AAT4-01Z-00-DX1.h5
│ ├── TCGA-5M-AAT5-01Z-00-DX1.h5
│ ├── TCGA-5M-AAT6-01Z-00-DX1.h5
│ ├── TCGA-5M-AATE-01Z-00-DX1.h5
│ ├── TCGA-A6-2671-01Z-00-DX1.h5
│ ├── TCGA-A6-2672-01Z-00-DX1.h5
│ └── TCGA-A6-2674-01Z-00-DX1.h5
├── process_list_autogen.csv
├── run_metadata_20230225T122426.json
└── stitches
  ├── TCGA-3L-AA1B-01Z-00-DX1.jpg
  ├── TCGA-4N-A93T-01Z-00-DX1.jpg
  ├── TCGA-4T-AA8H-01Z-00-DX1.jpg
  ├── TCGA-5M-AAT4-01Z-00-DX1.jpg
  ├── TCGA-5M-AAT5-01Z-00-DX1.jpg
  ├── TCGA-5M-AAT6-01Z-00-DX1.jpg
  ├── TCGA-5M-AATE-01Z-00-DX1.jpg
  ├── TCGA-A6-2671-01Z-00-DX1.jpg
  ├── TCGA-A6-2672-01Z-00-DX1.jpg
  └── TCGA-A6-2674-01Z-00-DX1.jpg
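Given that layout, one way to avoid picking up the extra files is to list only the model-outputs subdirectory and to match the metadata file by pattern rather than by exact name. The sketch below builds a tiny stand-in for the tree in a temp directory so it runs on its own. It is an illustration under that assumption, not how commandLineAlign.R currently globs.

```r
# Build a small stand-in for the tree above in a temp directory (illustration only).
root <- file.path(tempdir(), "results-wsinfer")
for (d in c("masks", "model-outputs", "patches", "stitches")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}
invisible(file.create(file.path(root, "model-outputs", "TCGA-3L-AA1B-01Z-00-DX1.csv")))
invisible(file.create(file.path(root, "process_list_autogen.csv")))
invisible(file.create(file.path(root, "run_metadata_20230225T122426.json")))

# List only the per-slide CSVs under model-outputs; top-level files are never seen.
preds <- list.files(file.path(root, "model-outputs"), pattern = "\\.csv$")

# Match the metadata file by pattern, since the timestamp differs across runs.
meta <- list.files(root, pattern = "^run_metadata_.*\\.json$")

print(preds)
print(meta)
```

Because list.files is not recursive by default, process_list_autogen.csv at the top level does not leak into the prediction list.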
hi @lthealy - i am running the tumor-til analysis pipeline on wsinfer outputs. i'm getting an error that "no predictions had exact pairs".
i have attached a tar file with a small dataset (one slide) to reproduce this error.
data.tar.gz
the dataset has the following folder structure:
here is the error: