mrview: uint32 and int16 overlay display issue

fionaEyoung commented 2 years ago

It seems that, on specific system environments, mrview displays overlay image voxels with value 1 as transparent when the image datatype is either uint32 or int16. This only happens when the image is loaded as an overlay, not when loaded directly.

The only reason I noticed this behaviour was when loading a fixel index.mif image, which is created as uint32le, and found the single fibre voxels empty (although the comments still report an overlay value of 1, see second image). Binary masks (ROIs or brain masks) are usually bitwise, and values other than 1 display fine.

datatypes fixel index

To Reproduce

open an image in mrview
choose an overlay image with datatype uint32{,le,be} or int16{,le,be}

Platform/Environment/Version

The issue was reproduced on two different (Intel) macOS systems:

-- 1:

OS: macOS 11.5.2
MRtrix3 version 3.0.3-41-g77326eaa
QT: 5.15.2
OpenGL: 4.1 ATI-4.6.20

-- 2:

OS: macOS 10.15.7
MRtrix3 version 3.0.3-37-gf633a02c
QT: 5.15.2
OpenGL: 4.1 INTEL-14.7.20

-- 3 The issue doesn't occur on my (M1) MBP:

OS: macOS 11.1
MRtrix3 version 3.0.3-41-g77326eaa
QT: 5.15.2
OpenGL: 4.1 Metal - 71.0.7

(In case it's relevant: systems 1 and 2 report Bit depths -> depth: 24 under mrview -> OpenGL information, while system 3 reports Bit depths -> depth: 32.)

(I updated MRtrix3 and retested before opening this issue, but prior to that I also tried downgrading MRtrix3 to whichever HEAD my MBP installation happened to be on before (which doesn't have the problem), and that made no difference.)

Advanced debugging information

for system 1

mrview -exit -debug
mrview: [DEBUG] No config file found at "/etc/mrtrix.conf"
mrview: [INFO] reading config file "/Users/fionayoung/.mrtrix.conf"...
mrview: [DEBUG] reading key/value file "/Users/fionayoung/.mrtrix.conf"...
mrview: [INFO] GL renderer:  AMD Radeon R9 M390 OpenGL Engine
mrview: [INFO] GL version:   4.1 ATI-4.6.20
mrview: [INFO] GL vendor:    ATI Technologies Inc.
mrview: [DEBUG] loading font into OpenGL texture...
mrview: [DEBUG] compiling OpenGL vertex shader:
#version 330 core
layout(location = 0) in vec2 pos;
layout(location = 1) in vec2 font_pos;
uniform float scale_x;
uniform float scale_y;
out vec2 tex_coord;
void main () {
  gl_Position = vec4 (pos[0]*scale_x-1.0, pos[1]*scale_y-1.0, 0.0, 1.0);
  tex_coord = font_pos;
}

mrview: [DEBUG] compiling OpenGL fragment shader:
#version 330 core
in vec2 tex_coord;
uniform sampler2D sampler;
uniform float red, green, blue;
out vec4 color;
void main () {
  color.ra = texture (sampler, tex_coord).rg;
  color.rgb = color.r * vec3 (red, green, blue);
}

mrview: [DEBUG] font loaded

jdtournier commented 2 years ago

Thanks for the report, @fionaEyoung. Unfortunately, I can't reproduce the problem on my system (64-bit Arch Linux, AMD Radeon RX 590 Series)... Can I just rule out the obvious: is there a threshold applied in the overlay tool? I can't see the sidebar in your screenshots... Assuming there are indeed no thresholds, then without access to a macOS system, I'm going to struggle to figure out what the problem is...

maxpietsch commented 2 years ago

I can confirm, 1s are not shown in the overlay if the datatype is uint32, uint32le, uint32be, int16, int16le, or int16be they are for all other datatypes (tested: float32 float32le float32be float64 float64le float64be int64 uint64 int64le uint64le int64be uint64be int32 uint32 int32le uint32le int32be uint32be int16 uint16 int16le uint16le int16be uint16be cfloat32 cfloat32le cfloat32be cfloat64 cfloat64le cfloat64be int8 uint8).

3.0.2-85-gabf12a87, GL version: 4.1 ATI-4.6.20 Qt 5.15.2

Linear integer ramp

as main image:

as overlay (also note zero is always black):

fionaEyoung commented 2 years ago

@jdtournier @maxpietsch The plot thickens... I created a test image, and it turns out that for uint16 its 2s that are not displayed 🤔

aah @maxpietsch you beat me to it, those ramps are way better!

maxpietsch commented 2 years ago

Can confirm the transparent 2 in uint16*:

jdtournier commented 2 years ago

OK, I had a quick look through the code to see if I could identify anything that might cause this... Nothing particularly obvious, other than the specific call to render the overlay: the overlay is always rendered using the call that allows arbitrary slice orientation (since we can't guarantee that the main image and the overlay lie on the same image grid) (GUI::ImageBase::render3D()), while the main image will by default be rendered using the call that cut through along the image axes (GUI::ImageBase::render2D()). There's otherwise very little difference between how the main image is treated vs. the overlay.

One way to verify whether this might have something to do with it is to display the image as a regular image (not overlay) and uncheck the 'lock to axes' button (or hit the 'L' key) - this forces the use of the render3D() call for the main image. If that also replicates the issue, then at least that'll give us a clue...

maxpietsch commented 2 years ago

bingo:

lock to axes False: lock to axes True:

jdtournier commented 2 years ago

OK, good to know. I'm really struggling to figure out what the problem might be. The data are copied over into a 3D texture on the GPU, stored in floating-point format with the equivalent bit depth ( i.e. UInt32LE would be copied over in GL_R32F texture). There are quite a few differences between the 2D and 3D texture updates, but given that everything works fine on my system, I suspect if there's an issue, it'll be in the OpenGL handling specifically. And the only difference on that score, really, is that we use a glTexImage3D() call for the 2D upload (all in one go), and a glTexSubImage3D() call for the 3D upload (one slice at a time). I can't think of anything that would cause this other than a straight out bug in the OpenGL call for the texture upload via glTexSubImage3D, specifically when integer types are being converted to floating-point internal representation in the process (which is what's happening here...).

Maybe you can send me some of those test files so I can triple-check on my side again, just in case?

maxpietsch commented 2 years ago

Test files used above: https://www.dropbox.com/s/8qa68v2ss0ypuzr/ramp.zip?dl=0

bjeurissen commented 2 years ago

On my Apple M1 machine I got the following results (using my own ramp, all little endian):

Main image: main

Overlay int8 (missing bottom 1 slice!): overlay_int8

Overlay int16 (missing bottom 2 slices!): overlay_int16

Overlay int32 (missing bottom 1 slice!): overlay_int32

Overlay uint8 (missing bottom 1 slice!): overlay_uint8

Overlay uint16 (missing bottom slices 1 and 3): overlay_uint16

Overlay uint32 (more weirdness...): overlay_uint32

Toggling 'lock to axis' did not seem to have any affect?!

bjeurissen commented 2 years ago

Could there be any relation with #2319?

jdtournier commented 2 years ago

Right, on my system, all of the test files display just fine - whether lock to axis is on or not, and whether displayed as overlays or not. There is clearly something specific about the macOS drivers that's not working as expected.

@maxpietsch: did you say the image displays fine when initially loaded, before unchecking the 'lock to axes' button? It's only when that button gets pressed that the issue occurs? That's messed up...

If we assume the CPU-side code runs identically on all systems, then the data being passed to the texture should be the same. Yet on your macOS systems, it clearly doesn't get handled correctly when provided via the glTexSubImage() call. From that point of view, @bjeurissen might have a point: there's a good chance this will relate to #2319. I have a feeling this is a driver bug. After a bit of digging, I came across this bug report from the Chromium project that seems very similar...

bjeurissen commented 2 years ago

When using glTexSubImage3D(), has the texture array been defined by a previous glTexImage3D() call? This appears to be a requirement according to https://docs.gl/gl3/glTexSubImage3D.

fionaEyoung commented 2 years ago

A couple more observations... As main image

The problem occurs only when 'lock to grid' is deactivated only in single slice and ortho view
The problem persists regardless of 'lock to grid' state in lightbox view and volume view (transparency range set to [0,1] )

Also, probably irrelevant, but with 'lock to grid' activated, I get a sort of "ghost image" (in the same colour as zero) when the focus is outside of the image volume (but still within an extended box, 111 pixels square for the 33 x 32 x 32 test ramps). With 'lock to grid' off, I just get black / transparent outside of the image range. Shown here for uint16: lock on: Screenshot 2021-11-24 at 11 58 49 lock off:

But this also occurs for datatypes not affected by the invisible-ones behaviour (e.g. float32) so I guess it's just a feature :)

jdtournier commented 2 years ago

Thanks, this is all consistent with the use of the render3D() pathway. And that last observation is also a consequence of that (though it could be fixed - might be worth doing in future).

When using glTexSubImage3D(), has the texture array been defined by a previous glTexImage3D() call? This appears to be a requirement according to https://docs.gl/gl3/glTexSubImage3D.

Just checked: this call is issued in the Volume::allocate() call (Volume is a base class of the Image class), which is itself called at this point within the Image::update_texture3D() call (itself called within ImageBase::render3D()). Pretty sure everything is fine from that point of view - indeed, I'd expect much more obvious issues on all platforms if that wasn't the case...

bjeurissen commented 2 years ago

I stumbled upon something interesting...

The tests I ran yesterday were performed with mrview bundled with the latest binary installer.

If I compile from source (master branch), the most pressing issues were resolved: the overlay is now just missing the bottom slice (value 0) for all uint types in the overlay, and everything is consistent between all uint types. So no more downright corrupted values.

So I am not sure what is causing this difference as the code has not really changed... Some thoughts:

When calling mrview from the standalone installer, it is actually running through the MRview.app bundle, so I was thinking this might have to do with the entry <key>NSHighResolutionCapable</key> <true/> in Info.plist. This would resonate with the chromium bug report that @jdtournier mentioned. I tried removing this line from the plist but this did not seem to change things. But I have the impression that these things are cached sometimes, so my test might not be conclusive... @maxpietsch @fionaEyoung are you using a version of mrview that has been compiled from source? Or the binary installed using the standalone installer?
The binary that I compiled today was built on the very latest version of macos with the very latest XCode and on ARM64, whereas the binary installer was probably built using Xcode from 1 or 2 generations ago and on x86_64... @maxpietsch @fionaEyoung: what XCode are you using?
Qt version did not change
The binary that I compiled today uses Eigen 3.4.0, whereas the one from the binary installer is using Eigen 3.3.9, but that sounds unlikely to be able to cause these issues...?

maxpietsch commented 2 years ago

@maxpietsch: did you say the image displays fine when initially loaded, before unchecking the 'lock to axes' button? It's only when that button gets pressed that the issue occurs? That's messed up...

Yes, toggling the button toggles the problem.

I am using mrview compiled from source (but I also have the MRview.app installed).

I updated my system since first testing this, the problem looks identical for mrtrix 3.0.2 with two driver versions:

OS X deployment target: 11.6, clang++: Apple clang version 13.0.0, GL: 4.1 ATI-4.6.20, Qt 5.15.2
OS X deployment target: 12.0, clang++: Apple clang version 13.0.0, GL: 4.1 ATI-4.7.29, Qt 5.15.2

bjeurissen commented 2 years ago

Left is using an x86_64 binary of the current master branch compiled from source using Xcode 13.1

Right is using an arm64 binary of the current master branch compiled from source using Xcode 13.1

Note that both are using the same OpenGL driver, same Qt version 5.15.2 and same eigen version 3.4.0.

Both running on the same arm64 machine.

fionaEyoung commented 2 years ago

I'm using MRtrix3 compiled from source. When I first noticed the behaviour I was on version 3.0.3-12-g55266be8, then I checkout out to the same earlier version as my MBP (which doesn't appear to have the problem), which made no difference, then I pulled the (then) latest version 3.0.3-41-g77326eaa which also changed nothing (that I noticed).

% pkgutil --pkg-info=com.apple.pkg.CLTools_Executables
package-id: com.apple.pkg.CLTools_Executables
version: 12.5.1.0.1.1623191612

% gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.5 (clang-1205.0.22.11)
Target: x86_64-apple-darwin20.6.0

(XCode command-line tools only, no app)

jdtournier commented 2 years ago

Note that both are using the same OpenGL driver, same Qt version 5.15.2 and same eigen version 3.4.0.

Both running on the same arm64 machine.

:exploding_head: this is so very messed up...

bjeurissen commented 2 years ago

🤯 this is so very messed up...

I did some further checking of the scenario in my last post.

All inputs to https://github.com/MRtrix3/mrtrix3/blob/994498557037c9e4f7ba67f255820ef84ea899d9/src/gui/mrview/volume.h#L127

are identical on both platforms. This includes the data variable for each slice, for which I did a binary comparison. So it really points to an inconsistency between the x86_64 and arm64 OpenGL driver, with the arm64 driver actually behaving better than the x86_64 one.

fionaEyoung commented 2 years ago

I just tried creating a brain mask using fsl's bet. The output was an int16 .nii.gz image and sure enough, invisible in mrview. The mask displayed just fine in FSLeyes though, which is also built on openGL (but written in python).

bjeurissen commented 2 years ago

@jdtournier: Would this be relevant: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glGetInternalformat.xhtml ?

bjeurissen commented 1 year ago

On Apple M1, with macOS 12.6.1 (21G217) and OpenGL renderer 4.1 Metal - 76.3, I can no longer reproduce any of these issues (including the bottom missing slice) when I am building from source (3.0.3-464-g84e11a42) with Qt 5.15.6.

I can still reproduce the issue when using the version from the current binary installer (which was compiled for x86-64 rather than arm64).

bjeurissen commented 3 months ago

Still a problem on latest macOS 14.5 and OpenGL renderer 4.1 - 88.1.

To reproduce it on modern arm64 macs with universal binaries:

for f in ramp_float64.mif ramp_*int*.mif; do 
  for arch in x86_64 arm64; do
    for lock in 0 1; do
      arch -${arch} mrview ${f} -focus -95.7642,-73.8797,-63.4686 -colourmap 6 -lock ${lock} \
        -capture.prefix ${arch}_${f%.mif}_lock${lock}_ -capture.grab -exit
    done
  done
done

Will show that problematic rendering only occurs:

if 'lock to grid' is deactivated (or when overlaying) AND
if using 'x86_64' binaries AND
if datatype is int16, uint16, or uint32

Note that arch -x86_64 ${cmd} will silently use arm64 ${cmd} when native arch is arm64 platform and ${cmd} is not universal.

MRtrix3 / mrtrix3

mrview: uint32 and int16 overlay display issue #2399