IMS data visualization - aliasing in Viz tool

pecan88 commented 4 years ago

Feedback

Help Desk Issue Tracker URL: https://on.spiceworks.com/tickets/open/650/activity

Message:

Appears to be an issue with visualization of IMS data.

The only critical issue we see right now is that all IMS data shows significant aliasing in the Visualization tool within the portal. When you first view the data, it shows extreme striping. However, if you zoom in, it is clear that this is just an artifact of the viewer. There is a minor artifact in the real data that is being exaggerated during downsampling.

We are happy to talk more about this if my description is not clear.

ilan-gold commented 4 years ago

@pecan88 We can solve this by re-running. The fix is out, AFAIK. I don't think it's a priority for us, but I'm tagging @ngehlenborg to bring him in on this.

ngehlenborg commented 4 years ago

It is an issue with the image processing tool that generates the image pyramids. The developers of that tool are working on a fix but it has not been released yet: https://github.com/glencoesoftware/bioformats2raw/pull/54

ngehlenborg commented 4 years ago

@jswelling: This issue should be moved to the repo where data processing issues are being tracked. If you tell me which one that is, I can move this.

ilan-gold commented 4 years ago

The steps for the fix are (1 and 2 can be done at any time before 6):

1. Revert all MALDI-IMS datasets to New
2. Remove all PROD derived image pyramids for MALDI-
3. Produce the new ome-tiff-pyramid branch, and run tests on TEST
4. Produce a PR to bump the devel version to that branch
5. Produce a new release of ingest-pipeline using the new branch, and update PROD and STAGE
6. Re-run all the MALDI-IMS datasets.

To get the build you will need to do something like:

wget https://ci.appveyor.com/api/buildjobs/654febn0xivg9fqn/artifacts/build%2Fdistributions%2Fbioformats2raw-0.2.6-SNAPSHOT.zip -O bioformats2raw.zip
unzip bioformats2raw.zip

in the Dockerfile here instead of the current method of downloading the repo and building (this is only for now, until this build is released officially and we can go back).

Then, when we actually run the bioformats2raw command, we will need to add --downsample-type=CUBIC to it, like bioformats2raw/bin/bioformats2raw --downsample-type=CUBIC... The code to change is: https://github.com/hubmapconsortium/ome-tiff-pyramid/blob/46d9926fc6d38a5f797c45421a36f552d7ce23e6/bin/ometiff_to_pyramid.py#L8-L18
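As a rough illustration of the change described above, here is a minimal Python sketch that assembles the command line with the downsample type passed in as a parameter. The function name, the example paths, and the placement of the input and output arguments after the flag are assumptions made for illustration; the real code in ometiff_to_pyramid.py may build the command differently. Only the executable path and the --downsample-type flag come from this thread.

```python
from pathlib import Path

# Path to the unzipped build, as written in the comment above.
BIOFORMATS2RAW = "bioformats2raw/bin/bioformats2raw"


def build_bioformats2raw_command(input_path, output_path, downsample_type):
    """Return the argument list for one bioformats2raw run.

    Hypothetical helper: the argument layout is an assumption, not the
    layout used by the ome-tiff-pyramid pipeline.
    """
    return [
        BIOFORMATS2RAW,
        f"--downsample-type={downsample_type}",
        str(Path(input_path)),
        str(Path(output_path)),
    ]


# Example with made-up paths; nothing is executed here.
command = build_bioformats2raw_command("input.ome.tiff", "output_n5", "CUBIC")
print(" ".join(command))
```

Keeping the downsample type as an argument rather than a hard-coded string makes it a one-line change to try a different method on the test datasets.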

Here are some datasets with nasty aliasing:

https://portal.hubmapconsortium.org/browse/dataset/f84314c2044f4d5226d88b5e234378fb
https://portal.hubmapconsortium.org/browse/dataset/db91d37984ce6a417dbf7347b684c6c1
https://portal.hubmapconsortium.org/browse/dataset/acd3c55b1a807eee9d660a55c2a7690b
https://portal.hubmapconsortium.org/browse/dataset/83291f309f2f2855d8ec8167d913a25e

Here are some datasets with less aliasing, to use as a check that the fix doesn't make the good cases worse:

https://portal.hubmapconsortium.org/browse/dataset/362918525f9725e6bb8fe5e0e5547ca6
https://portal.hubmapconsortium.org/browse/dataset/65aac96345f2fcc3610e163ccd039699

I think this covers everything - how to get the repo, the new parameter, and some sample data to test on that has both bad and less-bad aliasing.

jswelling commented 4 years ago

Test has been run on DEV; awaiting review of the resulting visualization.

ilan-gold commented 4 years ago

@jswelling It looks good to me. @ngehlenborg Here is the URL: https://portal.dev.hubmapconsortium.org/browse/dataset/95ae8b3e480a95c0564a55c2dcebf53a and here is the "reference" dataset that uses the old aggregation method: https://portal.hubmapconsortium.org/browse/dataset/e3bed3f4a4d4ca0cafb61415b8e21021

@jswelling I would like to do some more tests before approving this, including at least one on the list of "OK" datasets I posted. Also, could you post the reference dataset going forward so we can compare?

Last comment is for @ngehlenborg - you'll notice that the sliders are different which makes sense given the different aggregation methods (we use the lowest-res image to calculate channel statistics).

jswelling commented 4 years ago

@ilan-gold OK, I'll import and run a couple of others from your original list.

ilan-gold commented 4 years ago

Thanks @jswelling!

jswelling commented 4 years ago

The two that I will use for additional tests are:

PROD dataset 6561d1c623a06c49ed5ecd68cb28cd15 HBM394.SCHF.437 with derived dataset f84314c2044f4d5226d88b5e234378fb HBM377.DKHM.428. This one was 'nasty' above. The DEV clone of the parent dataset is 518b18c8a7cedda2676831117d6db8e9.

PROD dataset 598e80e7888712571caed6488c191302 HBM857.CTKH.498 with derived dataset 362918525f9725e6bb8fe5e0e5547ca6 HBM238.QHWB.479. This one was 'ok' above. The DEV clone of the parent dataset is b55cc98678735cc8bdbeae6b64e91a5c.

jswelling commented 4 years ago

@ilan-gold The runs described above are done and visible on DEV. The DOIs are HBM889.CWHR.876 and HBM759.SKVM.625 for 'nasty' and 'ok' respectively.

ilan-gold commented 4 years ago

I am looking at the images and they appear to have a negative slider range that is (presumably) not present in the original dataset. On second thought, CUBIC may not be such a good idea for this reason - I will look more deeply into the algorithms and our options. I think LINEAR may end up being the best option. Thanks for these tests @jswelling - wish I had caught this sooner.
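To make the negative-range concern concrete, here is a small pure-Python sketch of halving one row of pixels with a cubic kernel versus a linear one. It assumes the Keys cubic convolution kernel with a = -0.5 (the Catmull-Rom variant) and edge clamping; the kernel and boundary handling actually used by bioformats2raw may differ, and the pixel values are made up. The point is only that cubic kernels have negative side lobes, so a non-negative input can produce negative output, while linear averaging cannot.

```python
def keys_cubic_weight(distance, a=-0.5):
    """Keys cubic convolution kernel; a=-0.5 is an assumed parameter."""
    d = abs(distance)
    if d <= 1:
        return (a + 2) * d**3 - (a + 3) * d**2 + 1
    if d < 2:
        return a * d**3 - 5 * a * d**2 + 8 * a * d - 4 * a
    return 0.0


def downsample_row(row, kind):
    """Halve a 1-D row by sampling midway between each pair of pixels."""

    def pixel(i):
        # Clamp out-of-range indices to the nearest edge pixel.
        return row[min(max(i, 0), len(row) - 1)]

    out = []
    for j in range(len(row) // 2):
        left = 2 * j
        if kind == "linear":
            out.append(0.5 * (pixel(left) + pixel(left + 1)))
        elif kind == "cubic":
            # Four taps at distances 1.5, 0.5, 0.5, 1.5 from the sample point.
            taps = ((left - 1, -1.5), (left, -0.5), (left + 1, 0.5), (left + 2, 1.5))
            out.append(sum(pixel(i) * keys_cubic_weight(d) for i, d in taps))
        else:
            raise ValueError(f"unknown kind: {kind}")
    return out


# Hypothetical striped row: bright pixels separated by dark gaps.
row = [0, 100, 0, 0, 100, 0]
linear = downsample_row(row, "linear")
cubic = downsample_row(row, "cubic")
print("linear range:", min(linear), max(linear))
print("cubic range: ", min(cubic), max(cubic))
```

For this row the linear result stays between 0 and 50, while the cubic result dips to -12.5 where two dark pixels sit between bright neighbours. Since channel statistics are computed from the lowest-resolution image, an undershoot like this would show up directly as a negative slider minimum.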

ilan-gold commented 4 years ago

@ngehlenborg Do you have an opinion on the above?

jswelling commented 4 years ago

@ilan-gold @ngehlenborg I've re-run the 3 IMS files on DEV with LINEAR interpolation. Could you have a look? They are:

55bc1557d83abc9a5552b18086ae9f87 HBM255.WWLN.796 original example
48390549843df4c910ace00f4fe27dbd HBM797.GTNZ.645 'nasty'
6eb3530d80e6fdb0852635029d396226 HBM839.HWLC.645 'ok'

The corresponding PROD derived datasets are:

original: e3bed3f4a4d4ca0cafb61415b8e21021
nasty: f84314c2044f4d5226d88b5e234378fb
ok: 362918525f9725e6bb8fe5e0e5547ca6
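For side-by-side review, the IDs listed above can be turned into direct portal links. This is a minimal sketch that assumes the DEV and PROD portals both use the /browse/dataset/<id> URL pattern seen in the links earlier in this thread; the helper name and the "new"/"old" keys are made up for illustration.

```python
PROD_PORTAL = "https://portal.hubmapconsortium.org/browse/dataset/"
DEV_PORTAL = "https://portal.dev.hubmapconsortium.org/browse/dataset/"

# (label, DEV re-run with LINEAR, PROD derived dataset with the old downsampling)
PAIRS = [
    ("original", "55bc1557d83abc9a5552b18086ae9f87", "e3bed3f4a4d4ca0cafb61415b8e21021"),
    ("nasty", "48390549843df4c910ace00f4fe27dbd", "f84314c2044f4d5226d88b5e234378fb"),
    ("ok", "6eb3530d80e6fdb0852635029d396226", "362918525f9725e6bb8fe5e0e5547ca6"),
]


def comparison_links(pairs):
    """Map each label to its new (DEV) and old (PROD) portal URLs."""
    return {
        label: {"new": DEV_PORTAL + dev_id, "old": PROD_PORTAL + prod_id}
        for label, dev_id, prod_id in pairs
    }


for label, links in comparison_links(PAIRS).items():
    print(label)
    print("  new LINEAR:      ", links["new"])
    print("  old downsampling:", links["old"])
```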

ilan-gold commented 4 years ago

@ngehlenborg These look good to me. Do they look good to you?

ngehlenborg commented 4 years ago

@ilan-gold or @jswelling Can you please provide direct links? I am not sure what is what and we need to get this approved by the Vanderbilt TMC team anyway.

ilan-gold commented 4 years ago

An image with previously bad aliasing:

New LINEAR interpolation: https://portal.dev.hubmapconsortium.org/browse/dataset/48390549843df4c910ace00f4fe27dbd
Old downsampling: https://portal.hubmapconsortium.org/browse/dataset/f84314c2044f4d5226d88b5e234378fb

An image with previously not-so-bad aliasing:

New LINEAR interpolation: https://portal.dev.hubmapconsortium.org/browse/dataset/6eb3530d80e6fdb0852635029d396226
Old downsampling: https://portal.hubmapconsortium.org/browse/dataset/362918525f9725e6bb8fe5e0e5547ca6

ngehlenborg commented 4 years ago

Thanks!

ngehlenborg commented 4 years ago

@jswelling @ilan-gold: Resolved on 8/28/2020 portal call with Jeff Spraggins. We will use linear aggregation.

hubmapconsortium / portal-ui

IMS data visualization - aliasing in Viz tool #1028
