SLF4J error and query about max_image_dim_pix and max_processed_image_dim_pix

MathOnco / valis

Virtual Alignment of pathoLogy Image Series

https://valis.readthedocs.io/en/latest/

MIT License

124 stars 29 forks source link

SLF4J error and query about max_image_dim_pix and max_processed_image_dim_pix #55

Open teacakedeadlift opened 1 year ago

teacakedeadlift commented 1 year ago

Hi @cdgatenbee

I just updated valis-wsi in my conda env on the cluster using pip install valis-wsi --update from 1.0.0rc12 to 1.0.0rc15. I've now started getting this new warning about the SLF4J logger.

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation`
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
JVM has been initialized. Be sure to call registration.kill_jvm() or slide_io.kill_jvm() at the end of your script.

I assume it won't impact the alignments? Not been able to work out why it's just started happening though. I tried rea

I've also noticed whatever value I set for max_processed_image_dim_px it always resets it for a value lower and divisible by 497. If I set it above the default value of max_image_dim_pix of 850 (e.g. to 5000) I get the following:

UserWarning: max_image_dim_px is 850 but needs to be less or equal to 5000. Setting max_image_dim_px to 5000

And then after converting images:

UserWarning: Smallest image is less than max_image_dim_px. parameter max_image_dim_px is being set to 3976
UserWarning: parameter max_processed_image_dim_px also being updated to 3976

(500 -> 497, 3000 -> 1988, 5000 -> 3976). Also not sure why it does this and currently checking the warped slides to see if anything terrible has happened.

Incidentally getting better alignments using max_processed_image_dim_px=500 for the rigid registration and max_non_rigid_registration_dim_px=3000 for the micro. Do you think this is the best approach for larger slides that may have tears, folds, or cut out tissue on some sections? Anecdotally it works, but it's difficult to check multiple micro-registration changes on multiple serial sections due to time and storage constraints - currently 14 sections with each output slide around 20Gb.

Thanks

Phil

cdgatenbee commented 1 year ago

Hi @teacakedeadlift, I've seen the same error with SLF4J logger, but haven't quite pinpointed the cause or found the fix. It doesn't seem to prevent the code from executing, or affect the registration quality, but it is something I'd like to resolve.

Regarding the downsampling, I'm guessing what is happening is that valis is looking for pyramid levels that are less than max_processed_image_dim_px=850, the largest of which has a maximum width or height of 497 in one of the images. Then the other images get resized to have a width/height of 497. A similar thing would be happening with max_non_rigid_registration_dim_px=5000, but in this case it's finding a pyramid level with a max width/height of 3976. I should also mention that max_processed_image_dim_px is the one that would affect the rigid registration quality (since it is what determines the size of the processed image used for rigid registration), and max_non_rigid_registration_dim_px will affect the non-rigid registration quality. However, max_image_dim_px only determines the size of the images saved with each Slide object (i.e. Slide.image), which can be useful if you'd like work with the thumbnails (maybe for debugging, making figures, etc...). Also, yes, I think performing rigid registration on the lower resolution image, and non-rigid/micro-registration on higher resolution images is a solid approach. It seems to work well because the rigid registration can only roughly get everything in place, but the higher resolution non-rigid registration can align those details that weren't present in the lower resolution image used for rigid registration. One downside is that the alignment error estimates get less accurate as the rigid image gets smaller, although it usually involves over-estimating the error (discussed here). Other than that, I think it should work OK.

Best, -Chandler

teacakedeadlift commented 1 year ago

Hi Chandler

Thanks.

I've seen the same error with SLF4J logger, but haven't quite pinpointed the cause or found the fix. It doesn't seem to prevent the code from executing, or affect the registration quality, but it is something I'd like to resolve.

Great I'll continue as I am and not worry too much about it.

One downside is that the alignment error estimates get less accurate as the rigid image gets smaller, although it usually involves over-estimating the error (discussed here).

Would this increase be offset by the higher max_non_rigid_registration_dim_px setting and better non-rigid registration? I confess I don't understand exactly how the error is being measured in relation to fixed and moving landmarks.

My alignments are working well, but I have a problem case where one slide section has a single solid piece of tissue and the remainder have a tear in the middle that has caused the two halves of tissue to move apart on the slide. As I align to one of the sections with the two halves the first section in the stack with one solid piece is failing to align (and subsequently causing issues with the other sections such as being resized massively). I've played around with *_dim_pix values but not having much luck. Is there a way of improving alignment of fragmented sections to unfragmented ones? Would altering the RANSAC value help? Or trying to follow the process used in ACROBAT using a custom ImageProcesser and followed by micro-registration using an image that was 25% of the full image’s size? (I should stress the poorly aligning images are always the H&Es failing to align to H-DABs).

Thanks

Phil

teacakedeadlift commented 1 year ago

To update, I kept rigid registration as is:

registrar = registration.Valis(slide_src_dir, results_dst_dir, max_processed_image_dim_px=500, imgs_ordered=True, align_to_reference=True, reference_img_f=reference_slide)

But for microregistration increased to ~25% of full image size:

registrar.register_micro(max_non_rigid_registration_dim_px=40000, align_to_reference=True, reference_img_f=reference_slide)

However, this raises two issues. The first I get a new warning due to memory use:

UserWarning: Registration would more than 10 GB if all images opened in memory. Will use NonRigidTileRegistrar to register cooresponding tiles to reduce memory consumption, but this method is experimental

It then starts to work through the tiles before the error:

UnboundLocalError: local variable 'bk_dxdy_from_ref' referenced before assignment

And then terminates.

Adjusting microregistration to:

registrar.register_micro(max_non_rigid_registration_dim_px=40000)

Solves this issue. This error never happened before with lower max_non_rigid_registration_dim_px that do not trigger the NonRigidTileRegistrar so I assume it's due to assignment of align_to_reference=True, reference_img_f=reference_slide in microregistration when NonRigidTileRegistrar is invoked?

This raises some questions:

Is align_to_reference usually better to be False for registration.Valis() and then altered to True for registrar.register_micro(), or True/False, True/True, False/False? I have been trying to optimise but with smaller images the alignment is good whatever I do, and on large images the time to run with high max_non_rigid_registration_dim_px impairs my ability to do multiple runs.
Is aligning towards the reference slide better than aligning to a reference slide if all slide sections are good quality and not missing areas, but aligning to a reference better if the reference slide is intact but intervening slides in the stack are lower quality (and so may have fewer landmarks to align to)?
Is using max_non_rigid_registration_dim_px=40000 for an image that is 105592x164356 a bit over the top? Run time is now very long and I am yet to compare the outcome to lower values but if alignment is much improved it will save time later so I am happy for longer run times unless there are diminishing returns and a lower value (say 10% of max image size) will yield comparable results.

Thanks

Phil

cdgatenbee commented 1 year ago

Hi @teacakedeadlift,

Thanks for reporting this error, I'll see if I can replicate it. It's safe to ignore that warning, as it's mostly there just to let you know that valis is going to do a tiled registration. Maybe after I fix this bug I'll remove the bit about it being experimental, since it will have been in valis for about a year. I'll next try to answer your questions:

In general, I think False/True is a good approach because aligning towards the reference in the original registration (align_to_reference=False in registration.Valis()) will get everything pretty closely aligned, and then aligning directly to the reference in the microregistration (align_to_reference=True in registrar.register_micro()) can nudge things to their location in the reference image. You could also try setting compose_non_rigid=True when you initialize the Valis object, which will apply the previous image's non-rigid transform to the current image before performing the registration. This can sometimes improve alignments with a reference image ("groupwise" in below figure), as opposed to aligning each image directly to the reference image ("pairwise" below) (see the difference in the region inside the blue box). However, accumulation of the non-rigid transforms can sometimes result in unwanted distortions, especially if some images are missing tissue regions, which is why the current default is compose_non_rigid=False. Still, it might be worth trying, especially if your slides are good quality. This won't affect the features that are used to estimate alignment error, so you could compare non-rigid errors in the summary csv to see if it improves things.

I haven't done this sort of experiment, but the approach you're suggesting seems sound.
It can be hard to say whether or not setting max_non_rigid_registration_dim_px=40000 is overkill, but there does appear to be a point of diminishing returns regarding increasing the size of the image used for non-rigid registration. For the ANHIR grand challenge, we tried using different resolutions and found that while there was a trend towards reduced error, it was fairly small (i.e. < 10 pixels), and in some cases the error actually increased. The figure below shows the change in error at different resolutions when compared to the default max_non_rigid_registration_dim_px=850. It was for these reasons, and the fact that performing the registration on very large images is so time consuming, that we didn't go above 25% resolution in the ACROBAT grand challenge.

I hope the above is helpful, and if you have any more questions, please don't hesitate to ask.

Best, -Chandler

teacakedeadlift commented 1 year ago

Hi @cdgatenbee

Thanks. FYI I also got another error relating to large memory usage I forgot to mention: Image size (780337152 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. warnings.warn(warning_msg, warning_type) PIL.Image.DecompressionBombError: Image size (780337152 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.

I got around this by setting MAX_IMAGE_PIXELS prior to registration: from PIL import Image Image.MAX_IMAGE_PIXELS = 2000000000

Thanks for the answers. From a pragmatic stand point max_non_rigid_registration_dim_px=40000 was set to take 9 days and I'm not sure it added that much. I'm now trialling around the level that triggers NonRigidTileRegistrar as anecdotally values below threshold result in a faster process that uses vastly more memory (so queue longer on the cluster) whereas values above use lower memory but take much longer. The additional output directorydisplacements .tiff files crash QuPath as they lack pyramids and so I run out of RAM (I think, my mac is ancient). Not sure how to add in pyramids within VALIS as I've yet to work out where the code is that generates them. Doing a separate step using bfconvert, but what exactly do these displacement tiffs represent (and how important are they)?

Thanks

Phil

cdgatenbee commented 1 year ago

Hi @teacakedeadlift,

I believe the warning is thrown by PIL and/or OpenSlide to let you know the files are huge, even though you may only be accessing a lower resolution level of the image pyramid. It should be safe to ignore, since valis often works with the smaller images, or uses pyvips when working with the larger ones to avoid loading the full image into memory.

The NonRigidTileRegistrar should use less memory because it only has to load a few small tiles at a time, but it does take much longer because, even though it's multithreaded, there is a lock that slows things down significantly. Unfortunately, without the lock, the displacement fields don't seem to get updated, and so there are frequently empty tiles. I'd like to find a way to do this without the lock so as to speed things up, but haven't had much luck so far.

Regarding the tiffs in the data/displacements directory, they are displacement fields generated by the non-rigid registration. So, they are very important, as valis needs to them warp the images. Looking at them in something like QuPath may not be that informative (which is why they aren't saved as pyramids), and would look something like below, where channel 1 has the x-axis displacements, and channel 2 has the y-axis displacements:

What might be more useful for visualization are the images in the deformation_fields directory, which are thumbnails showing how those same displacements warp the images and a mesh overlay, giving you an idea of which parts of the images are getting warped, and to what degree. For example, here is such an image that would correspond to the displacements above:

0_401427_BrdU

Please keep me updated on how you experiments varying max_non_rigid_registration_dim_px go. It would be nice to know whether or not you find that increasing max_non_rigid_registration_dim_px has a point of diminishing returns, as we found in the ANHIR dataset.

Best, -Chandler

teacakedeadlift commented 1 year ago

Hi @cdgatenbee,

Sorry for delay, I've been checking outputs and comparing various configurations.

For NonRigidTileRegistrar whatever I do drastically increases the overall run time so I think I am settling on values low enough to not trigger it (<22000).

With regards to diminishing returns, comparing rigid_D, rigid_rTRE, non_rigid_D, and non_rigid_rTRE across runs suggests accuracy reaches its maximum fairly early on (at least with my trial slides). With lower max_processed_image_dim_px errors of alignment that fail QC are reduced so I've been using 500. For max_non_rigid_registration_dim_px measured error tends to be similar across samples but quality of the summary images when manually viewed is best at around 15000. Lower values seem to do poorly/miss edges or thin pieces of tissue for some reason - one even cropped a corner off entirely.

This trial and error has raised some questions:

What is the best or most intuitive way to interpret rigid_D, rigid_rTRE, non_rigid_D, and non_rigid_rTRE as well as mean_original_D, mean_rigid_D, and mean_non_rigid_D? What is the "best" measure of how valis is performing?
Does setting max_non_rigid_registration_dim_px at a large value (~15000) within registration.Valis() do anything useful or should I just set it within registrar.register_micro()? As I assume there is the non_rigid step within registration.Valis() followed by the registrar.register_micro() non_rigid step? Or is this just redoing the same step at the same resolution? I have trialled it both ways and the run time is longer but the outputs are very similar with marginal changes - although I haven't saved slides and looked at it in detail.
If 2 above is true which non_rigid step (initial run or micro registration) does the output non_rigid_D , non_rigid_rTRE, etc relate to?

Thanks

Phil

cdgatenbee commented 1 year ago

Hi @teacakedeadlift, Thanks for sharing the results of your experiments. Your finding that the point of diminishing returns is reached early on is pretty consistent with what we found using the ANHIR dataset (plot from above). Regarding your other questions, I'll try answer each:

All of the error estimates are based on the distances between the matched landmarks/features valis uses for rigid registration. If physical units (e.g. μm) are in the image's metadata, the distance (the D measures) represents the physical distances between the points. Otherwise, the distances are in pixels, but you can see what the units are by checking the physical_units column in the csv. So, the original values are the distances between those features before registration (estimated after centering the images); rigid values are the distances after rigid registration; non_rigid are the distances between matched features after non-rigid registration. The TRE (target registration error) are those same D values, but divided by the target images' diagonals, standardizing the error between 0 and 1. If the physical units are available, then I think the D values are better than TRE, as they're easier to interpret. TRE can be useful if you're trying to compare registration accuracy between image sets that have different resolutions, since the error is scaled between 0 and 1. Having said all of that, given that those error measurements are based on the images used for rigid registration, which have a maximum width or height of max_processed_image_dim_px, they aren't as accurate as if the landmarks came from the full resolution images (like in the ANHIR and ACROBAT challenges). So, they're mostly just estimates to get a sense of if the registration worked, how consistent the accuracy among the slides is, as well as judging how changing other parameters altered the accuracy. Some good news is that it seems most of the time the estimated error is much higher than the actual error (check out the ACROBAT and 3D registration error comparisons on the Datasets page). I would like to find a way to estimate error more accurately though, and it's something I'm actively working on.
You're correct that there is a non-rigid step in registration.Valis(), and the size of the images used is determined by max_non_rigid_registration_dim_px. The registrar.register_micro() is just an additional, optional, non-rigid registration step if you'd like to use a higher resolution image than was used in registration.Valis(). So, calling registrar.register_micro() using the same max_non_rigid_registration_dim_px you used in registration.Valis() may not do much. Instead, you could either skip the registrar.register_micro() step, or have max_non_rigid_registration_dim_px be smaller in registration.Valis(), maybe half the size of what you would use in registrar.register_micro()?
If you skip registrar.register_micro(), then the non_rigid_D and non_rigid_rTRE will be from registration.Valis() (unless you also skip the non-rigid registration altogether by setting non_rigid_registrar_cls=None). However, if you run registrar.register_micro(), the results in the csv will be updated to reflect the changes in accuracy from micro-registration.

Please let me know if you have any other questions.

Best, -Chandler

cdgatenbee commented 1 year ago

Hi @teacakedeadlift, Thought I would let you know that the most recent update of valis (1.0.1) includes a higher resolution rigid registration. If nothing else, this method should provide more accurate error estimates since the position of features used to estimate alignment error come from much higher resolution versions of the image and should be more precise. While I haven't gotten around to it yet, I plan to compare these error estimates with the ground truth errors using the 2022 ACROBAT grand challenge dataset to determine how much (if any) of any improvement this makes in the alignment and error estimates. If you're interested trying it out in the meantime, check out the example here.

Best, -Chandler

teacakedeadlift commented 1 year ago

Hi Chandler,

Sounds great. I will give it a go once I figure out how to update valis in my conda virtual env as currently I am getting an error relating to a dependency when i try to upgrade using pip: ERROR: Packages installed from PyPI cannot depend on packages which are not also hosted on PyPI. valis-wsi depends on aicspylibczi@ git+https://github.com/AllenCellModeling/aicspylibczi.git

and when i try to install this dependency: pip install aicspylibczi Requirement already satisfied: aicspylibczi Requirement already satisfied: numpy>=1.14.1

Seems like an odd error to get and might be awkward to workaround.

Thanks

Phil

cdgatenbee commented 1 year ago

Hi @teacakedeadlift, Sorry about this, i think it's a problem with the pyproject.toml file that got uploaded to PyPi. Working to fix it now so hopefully it will be addressed soon.

Best, -Chandler

cdgatenbee commented 1 year ago

Hi @teacakedeadlift, It looks like PyPi and TestPyPi are under maintenance right now, and so I cant upload the package yet. In the meantime, maybe you could try using pip to install the most recent version directly from github?

pip install git+https://github.com/MathOnco/valis.git

Hoping to have this fixed on PyPi sometime today.

Best, -Chandler

teacakedeadlift commented 1 year ago

Hi @cdgatenbee

Success! install worked direct from github to a new virtual env. I will give it a go and see what happens. Presumably this use the tiling method so lower memory but needs longer runtime?

How does specifying max_processed_image_dim_px and max_non_rigid_registration_dim_px within registration.Valis() work if you specify micro_rigid_registrar_cls=MicroRigidRegistrar? Should they be removed?

Thanks

Phil

cdgatenbee commented 1 year ago

Great, so glad the github install worked :) The MicroRigidRegistrar is actually used to update the initial low resolution rigid registration parameters and image features, so both max_processed_image_dim_px and max_non_rigid_registration_dim_px will have the same effect as always. For example, max_processed_image_dim_px will have a big impact on whether or not the initial registration works, which in turn will affect the results of the MicroRigidRegistrar (e.g. if the initial rigid registration is bad, then MicroRigidRegistrar may not be able to make it much better). Also, I added a Valis.draw_matches method that will save images showing where the matches were found, so you could compare with and without MicroRigidRegistrar to see what parts of the images were used for the rigid registration and error estimation. Anyway, looking forward to seeing how it works for you. Not yet sure the final results will be dramatically different (need to benchmark with the ACROBAT dataset), but I think the error estimates should at least be better (can verify this with the ACROBAT data too).

Best, -Chandler