Closed mabar1 closed 3 years ago
since I wanted to proceed with the analysis, I went and processed the ROIs with the nuclear pixel expansion method and fed these masks into the "Simple spatial 1 - add masks" script.
However, the do.create.outlines
function still produces the error we saw before:
> spatial.dat <- do.create.outlines(spatial.dat, 'cell.mask')
Creating polygons, outlines, and centroids using 'stars' method.
Processing masks for ROI ROI_1
although coordinates are longitude/latitude, st_union assumes that they are planar
Error: Problem with `summarise()` column `geometry`.
i `geometry = sf::st_union(geometry)`.
x Evaluation error: TopologyException: side location conflict at 1315 67.
i The error occurred in group 1: TEMP_MASK = 0.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
In st_is_longlat(x) :
bounding box has potentially an invalid value range for longlat data
edit:
using another mask, such as cell.mask.ext <- '_Nuclei_mask.tiff'
creates the same error, but the conflict happens at a different side:
Evaluation error: TopologyException: side location conflict at 17 311.
edit 2:
switching from stars to raster do.create.outlines(spatial.dat, 'cell.mask', method = "raster")
does not cause this error, but R simply runs out of RAM (32GB, R used 98% of it):
Creating polygons, outlines, and centroids using standard method -- this step may take some time, please be patient
Processing masks for ROI ROI_1
Error : memory exhausted (limit reached?)
Hi @tomashhurst sorry for posting so often. I just try to find a way to understand spectre..
I switched to the Advanced spatial workflow which is under development, but i just wanted to see how far I get with the multi-cut cell segmentation. Using these masks and the "Advanced spatial 1 - add masks" script, do.create.outlines
function works: I could add all three masks and create outlines for two of the three masks:
spatial.dat <- do.create.outlines(spatial.dat, 'cell.mask')
spatial.dat <- do.create.outlines(spatial.dat, 'cell.type')
spatial.dat <- do.create.outlines(spatial.dat, 'region') # <- this fails with same error in summarise() as above
and could also create the mask plots via make.spatial.plot
.
However extracting the signals using the cell.mask, do.extract(spatial.dat, 'cell.mask')
stops after segmentation of the channel signals:
(extracting all channels without error..)
... X208Pb_208Pb.omef
|================================================================================================================================================| 100%
... occurance in cell.type
... occurance in region
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘proj4string’ for signature ‘"NULL"’
Is this an error due to the fact that I did not manage to properly create an outline of the region?
Hi @mabar1 , sorry for the delayed reply! I think I've got an idea of what's going wrong for this run -- I'm having a look at the code and will get back to you shortly!
Hi @mabar1 , I have some updates for you:
v0.5.2
should have fixed a lot of installation issues AND spatial function issues, so try reinstalling Spectre and give it a shot.v0.5.2
and let me know if it works for you. (in terms of just creating the lists of files, '.tif' still shows any files that end in '.tiff' which is helpful, but when it's done inside the read.spatial.files
function it messes up the process.do.create.outlines
errors you mention above I was working on today, and that should work OK now. I'd like to know if you are able to get through this OK, and what operating system you are using -- there might be an additional non-R requirement on Mac.Let me know how that all goes for you, hopefully it should be sorted out!
@mabar1 I may have spoken too soon -- there is still an issue with of sf
and s2
are working -- fixed on my mac but not on the windows system in Spectre v0.5.2. Will have another look tomorrow!
@tomashhurst , ok no problem. Ill wait with loading spectre then. If you update, using install_github("immunedynamics/spectre", ref = 'spatial')’
this spatial branch is going to bring the updated version, or should i also update spectre from the immunedynamics repo? I just ask bc last time R built 'Spectre_0.4.7.tar.gz'
(see first post)?
I would just download from the main repo install_github("immunedynamics/spectre")
. At the moment it would be v0.5.2
but after the next update it will be v0.5.3
.
ok, yes I see that the spatial_workflows are now in the spectre repo as well. So I will use these scripts in there once 0.5.3 is online. thanks for the clarification
@mabar1 I think it's been resolved -- use install_github("immunedynamics/spectre")
to install the latest version (v0.5.3) and let me know how it goes.
@mabar1 v0.5.4 has some additional installation fixes.
@tomashhurst I made a clean install of v0.5.4 and let R update all dependencies.
In "Advanced spatial - 1 add masks" the only adaptation needed was read.spatial.files
when importing the ROI. A switch to ext = ".ome.tiff"
and it ran through. do.create.outlines
works for cell.mask, cell.type and region now.
In "Advanced spacial - 2 cellular analyis" took me a while to find out how cell.type.annot and region.annot works, but I got that eventually. When I process the data, I use asinh with a factor 5 since CyTOF folks told me to. And I use z-score normalization rather than rescale (again, CyTOF folks told me to). But apart from that the first two scripts run.
Thanks for that! Its too late to try the third script, "cluster annotation", so Ill leave that for later.
@mabar1 I'm hoping to have the online protocol for running through those written up properly during the week, which should help quite a bit!
no worries @tomashhurst Im fully aware that things are under construction. I feel like that guy that shows up to the party half an hour too early and then constantly asks where the chips are and why the snacks are still in the fridge. The thing that I have to have a look at now is why I cannot batch process the multicut in ilastik. Ill make a screenshot and write you a mail, ok? Then Ill downgrade from 1.3.3post3 and check. Because loading the whole ROI, 2000x2000px for a watershed is not the way to go: its all too small to control the segmentation and besides, ilastik crashes all the time. Ill have a look at that today. Thanks for your help again
So I up-scaled the data that spectre has to deal with and run into a problem in the read.spatial.files
function. This function regularly crashes out with
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'is.factor': cannot allocate vector of size 29.6 Mb
Given that it happens while reading the tiffs and at random tiff markers, but generally towards the end of reading the tiffs of a ROI, I thought that it might help to store chunks of ROIs into a temp.spatial.dat
and append afterwards. So I split the rois
into chunks of 5, read them in separately, then just appended that temp.spatial.dat to the big one and repeat over rois.
That helped to minimize the error for a while. But now that I got 51 ROIs in, the error persists.
Now the thing is that if I store the ROI that just caused the error into an empty spatial.dat
file, it imports just fine. Also, if I import the last few ROIs (say 41rd to 51st) into an empty spatial.dat
: no error. But if I then import from 1:51st, at ROI 43 it crashes out. And sure enough: If I revert the rois
(meaning I read in the last in the list first) the function works now with the ones that caused the error in the first place but R crashes with the ROIs that worked fine before.
So Im pretty sure this is a big-file handling problem or R.
edit: it was R running out of RAM. I simply increased mem size from 32603 MB to memory.limit(size=48000)
and now the script ran through in 4.8h
@mabar1 nice one! Yes that RStudio enforce vector limit thing can be pretty weird. So that was the only problem?
I've been reflecting on some better management and we have designs on managing more the spatial stuff using HDF5 (so the data is kept on disk during analysis). We have a working test version for cytometry analysis, but the spatial stuff might take a little longer to get it working.
I read in one comment that increasing mem size is dangerous, but I had not noticed any side effects yet. Plotting in Advanced Spatial 2 needs a lot of RAM, so I had to increase it there even further.
In the meantime I adapted the script to our purpose and also run phenograph and UMAP. and since the spatial.dat
, or rather the cell.dat
object are so nicely behaving objects, its easy to include more data in there. There are some minor things, such as a missing save.to.disk=F
option in the make.pheatmap
function. But this minor stuff.
But you are right, switching to hdf5 is a neat way. I processed the 51 ROIs just with the pixel classification routine, so I produced only tiff masks in CP3 (I had a well trained model in ilastik, so I went with that routine to handle this amount of ROIs). Meaning that I cannot easily create the region masks that the analysis asks for. But again, its rather straightforward to deactivate these steps in Advanced spatial 3
. There is a bug in the run.spatial.analysis
function, watch out: In order to work without the area.table mask, one has to set counts.per.area == FALSE
. This parameter is pre-set in the function, but the code does not have this condition. So I had to re-define the function anew and add:
if (counts.per.area == TRUE) { # <- this was missing, so if you dont have the area mask, it will fail.
message(" -- Calculating counts/area")
reg.res.by.area <- reg.res
for (u in regions) {
ar <- area.table[area.table[[sample.col]] == i,
u, with = FALSE]
ar <- ar[[1]]
ar <- as.numeric(ar)
reg.res.by.area[[u]] <- reg.res.by.area[[u]]/ar *
10000
}
reg.res.area.long <- melt(setDT(reg.res.by.area), id.vars = c("POPULATIONS"),
variable.name = "REGION")
reg.res.area.long
reg.res.area.long.new <- data.table()
reg.res.area.long.new$measure <- paste0("Cells per 100 sqr. pixels of ",
reg.res.area.long$REGION, " -- ", reg.res.area.long$POPULATIONS)
reg.res.area.long.new$counts <- reg.res.area.long$value
reg.res.area.long.new <- dcast(melt(reg.res.area.long.new,
id.vars = "measure"), variable ~ measure)
reg.res.area.long.new$variable <- NULL
}
and a bit further down the function:
sample.all <- cbind(samp.front,
reg.res.long.new,
#reg.res.area.long.new, # off since we dont have area.table masks
a.res.long.new,
b.res.long.new,
dist.res.list,
adj.res.list)
Again, its straight forward once you dig into the code. nice job!
Hi @tomashhurst I thought its easiest to open here a new issue to point towards some minor bugs I came across when working with spectre 0.5.
the first I noticed is after removing spectre and spectreMAP packages and re-installing spectre spatial branch via
install_github("immunedynamics/spectre", ref = 'spatial')’
R reportsso I asked myself if the spatial branch is fetching the proper version here?
I just went the multicut route, so I cannot say anything abt the other protocols. When exporting the tiffs from MCD viewer I tried both, 16 and 32bit ome.tiffs and both work. Maybe you can add that comment for others (just suggestions here :-) ) The "process TIFF" script works perfect, the protocol is missing a comment about the merge.channel step, but one figures that out. I like this feature actually a lot since it allows to merge channels that will help you detect the borders. Very nice idea.
During the multicut workflow in Ilastik there is a minor bug of ilastik. when using the border mask for segmentation ("multicut_2_segment"), ilastik 1.3 does not allow to load multiple datasets (see https://forum.image.sc/t/unable-to-load-multiple-datasets/38480/2), so one has to do that time-consuming step for every ROI. On the protocol it seems that you were able to load all .h5 files, at this point its not clear if you load the cropped or the raw files. Since the file above in the printscreen reads e.g. "ROI002-sta...bilities.h5", I assumed this means stacked and hence the cropped (the R script adds _crop to the filename. Since I could not process them as batch (nor could I use the crop as training and then batch process its raw file), I had to do the multicut on the raw files- one by one - which took quite a while..
I then went to simple analysis. The R-script "Simple spatial 1 - add masks" , or better the function
do.add.masks
requires the hdf5r package which is not preloaded (which is easily corrected by just add the library in the script). but it searches for a mask that is not generated:I cannot find this extension in any of the protocol so I assume this is a left-over from the old script? the script searches first for the masks by correctly scanning for:
but then filters for
edit: Am I correct, the scripts expects here the cell_mask.tiff from the cellprofiler nuclear pixel extension protocol, no?