teacakedeadlift opened 1 year ago
Hi @teacakedeadlift, I've seen the same error with SLF4J logger, but haven't quite pinpointed the cause or found the fix. It doesn't seem to prevent the code from executing, or affect the registration quality, but it is something I'd like to resolve.
Regarding the downsampling, I'm guessing what is happening is that valis is looking for pyramid levels that are smaller than max_processed_image_dim_px=850, the largest of which has a maximum width or height of 497 in one of the images. The other images then get resized to have a width/height of 497. A similar thing would be happening with max_non_rigid_registration_dim_px=5000, but in this case it's finding a pyramid level with a max width/height of 3976. I should also mention that max_processed_image_dim_px is the one that affects the rigid registration quality (since it determines the size of the processed image used for rigid registration), and max_non_rigid_registration_dim_px affects the non-rigid registration quality. However, max_image_dim_px only determines the size of the images saved with each Slide object (i.e. Slide.image), which can be useful if you'd like to work with the thumbnails (maybe for debugging, making figures, etc.).

Also, yes, I think performing rigid registration on the lower resolution image, and non-rigid/micro-registration on higher resolution images, is a solid approach. It seems to work well because the rigid registration can only roughly get everything in place, but the higher resolution non-rigid registration can align those details that weren't present in the lower resolution image used for rigid registration. One downside is that the alignment error estimates get less accurate as the rigid image gets smaller, although this usually involves over-estimating the error (discussed here). Other than that, I think it should work OK.
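To make the level-selection behavior concrete, here's a minimal sketch of the logic described above (pure Python; this is not valis's actual implementation, and the pyramid dimensions are made up to match the 497/3976 numbers in this thread):

```python
def pick_pyramid_level(level_dims, max_dim_px):
    """Return (level_index, (w, h)) for the highest-resolution pyramid
    level whose largest side fits within max_dim_px, or None if none fit.
    Levels are ordered from full resolution (0) down to the smallest."""
    for level, (w, h) in enumerate(level_dims):
        if max(w, h) <= max_dim_px:
            return level, (w, h)
    return None

# Hypothetical pyramid for a 7952 x 6000 slide, halving at each level
dims = [(7952, 6000), (3976, 3000), (1988, 1500), (994, 750), (497, 375)]

print(pick_pyramid_level(dims, 850))   # max_processed_image_dim_px -> (4, (497, 375))
print(pick_pyramid_level(dims, 5000))  # max_non_rigid_registration_dim_px -> (1, (3976, 3000))
```

Images whose pyramids lack a suitable level would then be resized to match, which is consistent with the other images being scaled to 497 as described above.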
Best, -Chandler
Hi Chandler
Thanks.
> I've seen the same error with SLF4J logger, but haven't quite pinpointed the cause or found the fix. It doesn't seem to prevent the code from executing, or affect the registration quality, but it is something I'd like to resolve.
Great I'll continue as I am and not worry too much about it.
> One downside is that the alignment error estimates get less accurate as the rigid image gets smaller, although it usually involves over-estimating the error (discussed here).
Would this increase be offset by the higher max_non_rigid_registration_dim_px setting and better non-rigid registration? I confess I don't understand exactly how the error is measured in relation to fixed and moving landmarks.
My alignments are working well, but I have a problem case where one slide section has a single solid piece of tissue and the remainder have a tear in the middle that has caused the two halves of tissue to move apart on the slide. As I align to one of the sections with the two halves, the first section in the stack with one solid piece is failing to align (and subsequently causing issues with the other sections, such as being resized massively). I've played around with *_dim_px values but am not having much luck. Is there a way of improving alignment of fragmented sections to unfragmented ones? Would altering the RANSAC value help? Or trying to follow the process used in ACROBAT, using a custom ImageProcesser followed by micro-registration on an image that was 25% of the full image's size? (I should stress the poorly aligning images are always the H&Es failing to align to the H-DABs.)
Thanks
Phil
To update, I kept rigid registration as is:

```python
registrar = registration.Valis(slide_src_dir, results_dst_dir, max_processed_image_dim_px=500, imgs_ordered=True, align_to_reference=True, reference_img_f=reference_slide)
```

But for micro-registration I increased to ~25% of the full image size:

```python
registrar.register_micro(max_non_rigid_registration_dim_px=40000, align_to_reference=True, reference_img_f=reference_slide)
```
However, this raises two issues. First, I get a new warning about memory use:

```
UserWarning: Registration would more than 10 GB if all images opened in memory. Will use NonRigidTileRegistrar to register cooresponding tiles to reduce memory consumption, but this method is experimental
```
It then starts to work through the tiles before the error:

```
UnboundLocalError: local variable 'bk_dxdy_from_ref' referenced before assignment
```

and then terminates.
Adjusting micro-registration to:

```python
registrar.register_micro(max_non_rigid_registration_dim_px=40000)
```

solves this issue. This error never happened before with lower max_non_rigid_registration_dim_px values that do not trigger the NonRigidTileRegistrar, so I assume it's due to the assignment of align_to_reference=True, reference_img_f=reference_slide in micro-registration when NonRigidTileRegistrar is invoked?
This raises some questions:

1. Is align_to_reference usually better set to False for registration.Valis() and then altered to True for registrar.register_micro(), or True/False, True/True, False/False? I have been trying to optimise, but with smaller images the alignment is good whatever I do, and on large images the time to run with high max_non_rigid_registration_dim_px impairs my ability to do multiple runs.
2. Is max_non_rigid_registration_dim_px=40000 for an image that is 105592x164356 a bit over the top? Run time is now very long and I am yet to compare the outcome to lower values, but if alignment is much improved it will save time later, so I am happy with longer run times unless there are diminishing returns and a lower value (say 10% of max image size) will yield comparable results.

Thanks
Phil
Hi @teacakedeadlift,
Thanks for reporting this error, I'll see if I can replicate it. It's safe to ignore that warning, as it's mostly there just to let you know that valis is going to do a tiled registration. Maybe after I fix this bug I'll remove the bit about it being experimental, since it will have been in valis for about a year. I'll next try to answer your questions:
1. Regarding align_to_reference: I think keeping align_to_reference=False in registration.Valis() will get everything pretty closely aligned, and then aligning directly to the reference in the micro-registration (align_to_reference=True in registrar.register_micro()) can nudge things to their location in the reference image. You could also try setting compose_non_rigid=True when you initialize the Valis object, which will apply the previous image's non-rigid transform to the current image before performing the registration. This can sometimes improve alignments with a reference image ("groupwise" in the below figure), as opposed to aligning each image directly to the reference image ("pairwise" below) (see the difference in the region inside the blue box). However, accumulation of the non-rigid transforms can sometimes result in unwanted distortions, especially if some images are missing tissue regions, which is why the current default is compose_non_rigid=False. Still, it might be worth trying, especially if your slides are good quality. This won't affect the features that are used to estimate alignment error, so you could compare non-rigid errors in the summary csv to see if it improves things. I haven't done this sort of experiment, but the approach you're suggesting seems sound.
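Concretely, the two-stage setup being discussed might look like this sketch (parameter names as used in this thread; slide_src_dir, results_dst_dir, and reference_slide are placeholders, and this assumes the valis package is installed):

```python
from valis import registration

# Stage 1: serial alignment (align_to_reference=False), optionally
# composing non-rigid transforms through the stack ("groupwise"):
registrar = registration.Valis(
    slide_src_dir, results_dst_dir,
    imgs_ordered=True,
    align_to_reference=False,
    reference_img_f=reference_slide,
    compose_non_rigid=True,  # default is False; may distort if tissue is missing
)
rigid_registrar, non_rigid_registrar, error_df = registrar.register()

# Stage 2: nudge everything directly onto the reference at higher resolution
registrar.register_micro(
    max_non_rigid_registration_dim_px=10000,
    align_to_reference=True,
    reference_img_f=reference_slide,
)
```

Comparing the non-rigid errors in the summary csv between compose_non_rigid=True and compose_non_rigid=False runs would show whether the composition helps for these particular slides.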
2. It can be hard to say whether or not setting max_non_rigid_registration_dim_px=40000 is overkill, but there does appear to be a point of diminishing returns when increasing the size of the image used for non-rigid registration. For the ANHIR grand challenge, we tried using different resolutions and found that while there was a trend towards reduced error, it was fairly small (i.e. < 10 pixels), and in some cases the error actually increased. The figure below shows the change in error at different resolutions when compared to the default max_non_rigid_registration_dim_px=850. It was for these reasons, and the fact that performing the registration on very large images is so time consuming, that we didn't go above 25% resolution in the ACROBAT grand challenge.
I hope the above is helpful, and if you have any more questions, please don't hesitate to ask.
Best, -Chandler
Hi @cdgatenbee
Thanks. FYI I also got another error relating to large memory usage that I forgot to mention:

```
Image size (780337152 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
  warnings.warn(warning_msg, warning_type)
PIL.Image.DecompressionBombError: Image size (780337152 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
```
I got around this by setting MAX_IMAGE_PIXELS prior to registration:

```python
from PIL import Image
Image.MAX_IMAGE_PIXELS = 2000000000
```
Thanks for the answers. From a pragmatic standpoint, max_non_rigid_registration_dim_px=40000 was set to take 9 days and I'm not sure it added that much. I'm now trialling values around the level that triggers NonRigidTileRegistrar, as anecdotally values below the threshold result in a faster process that uses vastly more memory (so queue longer on the cluster), whereas values above use less memory but take much longer. The .tiff files in the additional displacements output directory crash QuPath, as they lack pyramids and so I run out of RAM (I think; my mac is ancient). Not sure how to add in pyramids within VALIS as I've yet to work out where the code is that generates them. Doing a separate step using bfconvert, but what exactly do these displacement tiffs represent (and how important are they)?
Thanks
Phil
Hi @teacakedeadlift,
I believe the warning is thrown by PIL and/or OpenSlide to let you know the files are huge, even though you may only be accessing a lower resolution level of the image pyramid. It should be safe to ignore, since valis often works with the smaller images, or uses pyvips when working with the larger ones to avoid loading the full image into memory.
The NonRigidTileRegistrar should use less memory because it only has to load a few small tiles at a time, but it does take much longer because, even though it's multithreaded, there is a lock that slows things down significantly. Unfortunately, without the lock, the displacement fields don't seem to get updated, and so there are frequently empty tiles. I'd like to find a way to do this without the lock so as to speed things up, but haven't had much luck so far.

Regarding the tiffs in the data/displacements directory, they are displacement fields generated by the non-rigid registration. So, they are very important, as valis needs them to warp the images. Looking at them in something like QuPath may not be that informative (which is why they aren't saved as pyramids), and would look something like below, where channel 1 has the x-axis displacements and channel 2 has the y-axis displacements:
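Since the displacement tiffs aren't saved as pyramids, one way to make a pyramidized copy for viewing (done outside valis; this assumes pyvips is installed, and the file names are placeholders) would be:

```python
import pyvips

# Make a tiled, pyramidal copy of a flat displacement tiff so that
# viewers like QuPath can load it at reduced resolution.
img = pyvips.Image.new_from_file("data/displacements/example_displacement.tiff")
img.tiffsave(
    "example_displacement_pyramid.tiff",
    tile=True,           # tiled layout
    pyramid=True,        # write reduced-resolution levels
    compression="lzw",
    bigtiff=True,        # needed for very large files
)
```

This is just for inspection; valis reads the original flat files when warping, so keep those unchanged.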
What might be more useful for visualization are the images in the deformation_fields
directory, which are thumbnails showing how those same displacements warp the images and a mesh overlay, giving you an idea of which parts of the images are getting warped, and to what degree. For example, here is such an image that would correspond to the displacements above:
Please keep me updated on how your experiments varying max_non_rigid_registration_dim_px go. It would be nice to know whether or not you find that increasing max_non_rigid_registration_dim_px has a point of diminishing returns, as we found in the ANHIR dataset.
Best, -Chandler
Hi @cdgatenbee,
Sorry for delay, I've been checking outputs and comparing various configurations.
For NonRigidTileRegistrar, whatever I do drastically increases the overall run time, so I think I am settling on values low enough not to trigger it (<22000).

With regards to diminishing returns, comparing rigid_D, rigid_rTRE, non_rigid_D, and non_rigid_rTRE across runs suggests accuracy reaches its maximum fairly early on (at least with my trial slides). With lower max_processed_image_dim_px, errors of alignment that fail QC are reduced, so I've been using 500. For max_non_rigid_registration_dim_px, measured error tends to be similar across samples, but the quality of the summary images when manually viewed is best at around 15000. Lower values seem to do poorly/miss edges or thin pieces of tissue for some reason - one even cropped a corner off entirely.
This trial and error has raised some questions:

1. What is the difference between rigid_D, rigid_rTRE, non_rigid_D, and non_rigid_rTRE, as well as mean_original_D, mean_rigid_D, and mean_non_rigid_D? What is the "best" measure of how valis is performing?
2. Does setting max_non_rigid_registration_dim_px at a large value (~15000) within registration.Valis() do anything useful, or should I just set it within registrar.register_micro()? As I assume there is the non-rigid step within registration.Valis() followed by the registrar.register_micro() non-rigid step? Or is this just redoing the same step at the same resolution? I have trialled it both ways and the run time is longer, but the outputs are very similar with marginal changes - although I haven't saved slides and looked at it in detail.
3. Which registration step do the reported non_rigid_D, non_rigid_rTRE, etc. relate to?

Thanks
Phil
Hi @teacakedeadlift, Thanks for sharing the results of your experiments. Your finding that the point of diminishing returns is reached early on is pretty consistent with what we found using the ANHIR dataset (plot from above). Regarding your other questions, I'll try to answer each:

1. If the physical units are available, the distance (i.e. the D measures) represents the physical distances between the points. Otherwise, the distances are in pixels, but you can see what the units are by checking the physical_units column in the csv. So, the original values are the distances between those features before registration (estimated after centering the images); the rigid values are the distances after rigid registration; the non_rigid values are the distances between matched features after non-rigid registration. The TRE (target registration error) values are those same D values, but divided by the target images' diagonals, standardizing the error between 0 and 1. If the physical units are available, then I think the D values are better than TRE, as they're easier to interpret. TRE can be useful if you're trying to compare registration accuracy between image sets that have different resolutions, since the error is scaled between 0 and 1. Having said all of that, given that those error measurements are based on the images used for rigid registration, which have a maximum width or height of max_processed_image_dim_px, they aren't as accurate as if the landmarks came from the full resolution images (like in the ANHIR and ACROBAT challenges). So, they're mostly just estimates to get a sense of whether the registration worked, how consistent the accuracy is among the slides, and how changing other parameters altered the accuracy. Some good news is that it seems most of the time the estimated error is much higher than the actual error (check out the ACROBAT and 3D registration error comparisons on the Datasets page). I would like to find a way to estimate error more accurately though, and it's something I'm actively working on.

2. The initial non-rigid registration is performed within registration.Valis()
, and the size of the images used is determined by max_non_rigid_registration_dim_px. The registrar.register_micro() is just an additional, optional, non-rigid registration step, if you'd like to use a higher resolution image than was used in registration.Valis(). So, calling registrar.register_micro() with the same max_non_rigid_registration_dim_px you used in registration.Valis() may not do much. Instead, you could either skip the registrar.register_micro() step, or have max_non_rigid_registration_dim_px be smaller in registration.Valis(), maybe half the size of what you would use in registrar.register_micro()?

3. If you skip registrar.register_micro(), then the non_rigid_D and non_rigid_rTRE will be from registration.Valis() (unless you also skip the non-rigid registration altogether by setting non_rigid_registrar_cls=None). However, if you run registrar.register_micro(), the results in the csv will be updated to reflect the changes in accuracy from micro-registration.

Please let me know if you have any other questions.
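As a hedged illustration of how the D and TRE measures relate (toy coordinates, not valis output; valis computes these internally from matched features):

```python
import math

def alignment_errors(moving_xy, fixed_xy, target_wh):
    """D: distances between matched features; TRE: the same distances
    divided by the target image's diagonal, scaling the error to [0, 1]."""
    w, h = target_wh
    diagonal = math.hypot(w, h)
    d = [math.hypot(mx - fx, my - fy)
         for (mx, my), (fx, fy) in zip(moving_xy, fixed_xy)]
    tre = [di / diagonal for di in d]
    return d, tre

# Toy matched features in a 497 x 375 registration image
moving = [(100.0, 120.0), (300.0, 200.0)]
fixed = [(103.0, 124.0), (300.0, 205.0)]
d, tre = alignment_errors(moving, fixed, (497, 375))
print(d)  # [5.0, 5.0]
```

If physical units were available, the pixel distances would simply be multiplied by the pixel size, which is why the D values are easier to interpret when units are known.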
Best, -Chandler
Hi @teacakedeadlift, Thought I would let you know that the most recent update of valis (1.0.1) includes a higher resolution rigid registration. If nothing else, this method should provide more accurate error estimates, since the positions of the features used to estimate alignment error come from much higher resolution versions of the images and so should be more precise. While I haven't gotten around to it yet, I plan to compare these error estimates with the ground truth errors from the 2022 ACROBAT grand challenge dataset to determine how much (if any) improvement this makes in the alignment and error estimates. If you're interested in trying it out in the meantime, check out the example here.
Best, -Chandler
Hi Chandler,
Sounds great. I will give it a go once I figure out how to update valis in my conda virtual env, as currently I am getting an error relating to a dependency when I try to upgrade using pip:

```
ERROR: Packages installed from PyPI cannot depend on packages which are not also hosted on PyPI. valis-wsi depends on aicspylibczi@ git+https://github.com/AllenCellModeling/aicspylibczi.git
```

and when I try to install this dependency:

```
pip install aicspylibczi
Requirement already satisfied: aicspylibczi
Requirement already satisfied: numpy>=1.14.1
```

Seems like an odd error to get and might be awkward to work around.
Thanks
Phil
Hi @teacakedeadlift, Sorry about this, I think it's a problem with the pyproject.toml file that got uploaded to PyPI. Working to fix it now, so hopefully it will be addressed soon.
Best, -Chandler
Hi @teacakedeadlift, It looks like PyPI and TestPyPI are under maintenance right now, and so I can't upload the package yet. In the meantime, maybe you could try using pip to install the most recent version directly from github?

```
pip install git+https://github.com/MathOnco/valis.git
```

Hoping to have this fixed on PyPI sometime today.
Best, -Chandler
Hi @cdgatenbee
Success! The install worked direct from github to a new virtual env. I will give it a go and see what happens. Presumably this uses the tiling method, so lower memory but longer runtime?

How does specifying max_processed_image_dim_px and max_non_rigid_registration_dim_px within registration.Valis() work if you specify micro_rigid_registrar_cls=MicroRigidRegistrar? Should they be removed?
Thanks
Phil
Great, so glad the github install worked :) The MicroRigidRegistrar is actually used to update the initial low resolution rigid registration parameters and image features, so both max_processed_image_dim_px and max_non_rigid_registration_dim_px will have the same effect as always. For example, max_processed_image_dim_px will have a big impact on whether or not the initial registration works, which in turn will affect the results of the MicroRigidRegistrar (e.g. if the initial rigid registration is bad, then MicroRigidRegistrar may not be able to make it much better). Also, I added a Valis.draw_matches method that will save images showing where the matches were found, so you could compare with and without MicroRigidRegistrar to see what parts of the images were used for the rigid registration and error estimation. Anyway, looking forward to seeing how it works for you. Not yet sure the final results will be dramatically different (need to benchmark with the ACROBAT dataset), but I think the error estimates should at least be better (can verify this with the ACROBAT data too).
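For reference, a hedged sketch of enabling the high-resolution rigid step alongside the usual size parameters (the MicroRigidRegistrar import path and the draw_matches signature are my assumptions based on this thread; paths are placeholders):

```python
from valis import registration
from valis.micro_rigid_registrar import MicroRigidRegistrar  # import path assumed

registrar = registration.Valis(
    slide_src_dir, results_dst_dir,
    max_processed_image_dim_px=500,          # still sets the initial rigid registration size
    max_non_rigid_registration_dim_px=3000,  # still sets the non-rigid registration size
    micro_rigid_registrar_cls=MicroRigidRegistrar,  # refines the rigid result at higher res
)
rigid_registrar, non_rigid_registrar, error_df = registrar.register()
registrar.draw_matches(results_dst_dir)  # signature assumed: save match visualizations
```

Comparing the saved match images with and without micro_rigid_registrar_cls set should show which image regions drove the rigid registration and error estimation, as described above.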
Best, -Chandler
Hi @cdgatenbee
I just updated valis-wsi in my conda env on the cluster using pip install valis-wsi --upgrade, going from 1.0.0rc12 to 1.0.0rc15. I've now started getting this new warning about the SLF4J logger. I assume it won't impact the alignments? Not been able to work out why it's just started happening though. I tried rea

I've also noticed that whatever value I set for max_processed_image_dim_px, it always resets it to a lower value that is divisible by 497. If I set it above the default max_image_dim_px value of 850 (e.g. to 5000) I get the following: And then after converting images: (500 -> 497, 3000 -> 1988, 5000 -> 3976). Also not sure why it does this; currently checking the warped slides to see if anything terrible has happened.
Incidentally, I am getting better alignments using max_processed_image_dim_px=500 for the rigid registration and max_non_rigid_registration_dim_px=3000 for the micro. Do you think this is the best approach for larger slides that may have tears, folds, or cut-out tissue on some sections? Anecdotally it works, but it's difficult to check multiple micro-registration changes on multiple serial sections due to time and storage constraints - currently 14 sections with each output slide around 20Gb.

Thanks
Phil