visit-dav / summer-projects

A place to manage activity on summer projects

Play with downsampling of a larger dataset to something that is easier to use on a laptop #16

Closed markcmiller86 closed 1 year ago

markcmiller86 commented 2 years ago

Some possibilities

  1. https://visit-dav.github.io/largedata/datarchives/visit_data_files
  2. Mark to add one here.
markcmiller86 commented 2 years ago

Woohoo...we made our first image of a downsampled version of the vessels in the bigtoe dataset...

[Screenshot: first image of the downsampled bigtoe vessel data, 2022-06-28]

The downsampled data is available at

/p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample

The .imgvol file to see in VisIt is BigToe_IR_rec0000_4x4x4_downsample.imgvol

This is a very crude downsample. In fact, it's so crude it isn't really fair to call it a downsample. What I did was use ImageMagick's convert routine like so...

Shell script to downsample each .tif file by 4 in x and y, append _smaller to the file name, and put the result in the folder resample_4x4

#!/bin/sh

# Should be 8906 of these
mkdir -p resample_4x4
for f in BigToe_IR_rec00*.tif; do
    bname=$(basename "$f" .tif)
    echo "$bname"
    # resize to 1008x1008 (a 4x reduction in x and y) and recompress with LZW
    convert "${bname}.tif" -resize 1008x1008 -compress lzw "resample_4x4/${bname}_smaller.tif"
done

Shell script to keep only every 4th .tif file in a new folder (resample_4x4x4)

#!/bin/bash

# Should be 8906 of these; keep only every 4th one
mkdir -p resample_4x4x4
i=0
for f in resample_4x4/BigToe_IR_rec00*.tif; do
    if [[ $((i % 4)) -eq 0 ]]; then
        cp "$f" resample_4x4x4/.
        echo "$f"
    fi
    i=$((i+1))
done
markcmiller86 commented 2 years ago

@wmondy you may be interested to know about this progress

wmondy2 commented 2 years ago

Wow!!!

wmondy2 commented 2 years ago

That looks great, Mark! How does it look when you zoom in to the capillary at the tip of the toe?

What is the size of the resulting model? (stl?)

wmondy2 commented 2 years ago

Great work. Thank you, Mark!

markcmiller86 commented 2 years ago

Thanks Prof. Mondy...it is great to get these first images out.

We didn't export an STL model. This was using VisIt's iso-volume operator at a threshold of around 15000 I think (or maybe it was 10000) and a cylindrical clip operator to remove the mounting case (well, most of it...you can still see a bit of it in this image).
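
In case anyone wants to reproduce this from VisIt's Python CLI, a minimal sketch of roughly those steps is below. The variable name (intensity) and the cylinder geometry are placeholders, not the exact values I used; the 15000 threshold is the one mentioned above.

# Rough sketch (VisIt Python CLI) of the iso-volume + cylindrical clip steps.
# "intensity" and the cylinder endpoints/radius are placeholders; adjust to
# the variable name the .imgvol reader exposes and to the dataset's coordinates.
OpenDatabase("/p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample/BigToe_IR_rec0000_4x4x4_downsample.imgvol")

AddPlot("Pseudocolor", "intensity")

# Keep only voxels at or above the ~15000 threshold
AddOperator("Isovolume")
iso = IsovolumeAttributes()
iso.lbound = 15000
SetOperatorOptions(iso)

# Clip away (most of) the mounting case with a cylinder aligned to the toe
AddOperator("Cylinder")
cyl = CylinderAttributes()
cyl.point1 = (500, 500, 0)      # placeholder cylinder axis endpoints
cyl.point2 = (500, 500, 2300)
cyl.radius = 450                # placeholder radius
cyl.inverse = 0                 # keep what is inside the cylinder
SetOperatorOptions(cyl)

DrawPlots()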

Exporting STL would be a good thing to try, but I suspect that will fail or the resulting file(s) will be uber-massive...many times bigger than your first attempt 😉 because a) we currently write STL in ASCII (I want to update VisIt to do STL in binary) and b) we write multi-part files...you'd wind up with one STL file per processor that was being used to view the data (144 in this case). Still, it would be worth trying to see what we get.

So, it's a nice first step. This input volumetric dataset is only 2.6 GB though. You could download it to your machine using the same commands we used to try to upload that Saturday morning a few weeks back. @Magicat-0 (Wendy) has successfully done that and @corvette20 (Joshua) may do so later this week.

On Windows, you would start Windows PowerShell and then run this command

scp -o MACs=hmac-sha2-512 "mondy1@pascal.llnl.gov:/p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample/*.tif" .

Given my recollection of the bandwidth we were seeing when we did something similar, it could take 1-3 hours to complete. Also, I am just noticing those files are not yet readable by all users on the CZ. I am correcting that now but it could take an hour for permissions to update.

markcmiller86 commented 2 years ago

I've played around a bit more with the downsampled data in VisIt using VisIt's connected components features. I was able to use them somewhat successfully to weed out smaller, disconnected stuff from larger, connected vessel structures. A resulting image is below...

[Screenshot: downsampled data after connected-components filtering, 2022-06-28]

I think (not entirely sure yet) I can use connected components to compute the volume of individual connected blobs (which for this dataset would be blood vessels). If I then threshold based on the volume of those blobs, I should be able to create datasets containing only the larger-volume connected blobs. That is what I am doing here.

I then exported (and also did some save as...) to Wavefront OBJ, STL and VTK formats. For the example pictured here, those files are all around 1 GB in size. I put these in /p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample/visit_exports
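
If it helps anyone pick this up, below is a rough sketch of what the connected-components labeling, thresholding, and export could look like from VisIt's Python CLI. The mesh and variable names, the blob-id range, and the export settings are assumptions on my part, not exactly what I ran; selecting blobs by their computed volume would take an extra step beyond this.

# Rough sketch (VisIt Python CLI): label connected blobs, keep a subset, export.
# "mesh" and "intensity" are placeholder names; adjust to the .imgvol data.
OpenDatabase("/p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample/BigToe_IR_rec0000_4x4x4_downsample.imgvol")

# Give every connected blob an integer label (connected components expression)
DefineScalarExpression("blob_id", "conn_components(mesh)")

AddPlot("Pseudocolor", "blob_id")

# Keep voxels above the vessel threshold, restricted to a range of blob ids.
# This only shows the thresholding mechanics; volume-based selection of blobs
# (as described above) would need a per-blob volume computed first.
AddOperator("Threshold")
t = ThresholdAttributes()
t.listedVarNames = ("intensity", "blob_id")
t.lowerBounds = (15000, 0)
t.upperBounds = (1e+37, 10)     # placeholder: keep only the first few blob ids
SetOperatorOptions(t)
DrawPlots()

# Export the result (STL here; OBJ and VTK exports work the same way)
e = ExportDBAttributes()
e.db_type = "STL"
e.filename = "bigtoe_vessels"
ExportDatabase(e)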

wmondy commented 2 years ago

It looks good, is much smaller, and is easy to transfer and for students to play around with on a laptop. The problem is that we are losing critical data on the structure of the capillary bed; you can easily see the loss between the first model you presented and this one. And the first model, the 4x4x4 down-sample, probably already represents a loss of structure; we haven't seen the complete model of the entire data set to make that assessment. In effect, we are re-creating vascular disease, which might not be a bad thing to model. But ultimately, the project's point is to create a complete and accurate model of the capillary bed system.

The capillary bed keeps the tissue alive while nourishing the individual cells. Loss of the capillary bed is what occurs in ischemia resulting from angiopathy/vascular disease. Ischemia results in cell death and, as a result, tissue loss. In the brain, that equates to an infarct that creates a stroke; in the heart, it equates to a cardiac infarct that creates tissue loss resulting in a heart attack; in the foot, ischemia results in neuropathy and the death of skin, muscle, and later bone tissue, resulting in chronic foot ulcers that lead to gangrene and amputation.

So, our ultimate goal is to create a viable tissue structure that accurately represents the microvascular/capillary architecture supporting the original tissue architecture (or the functional architecture), using a scaffolding that accurately represents this architecture. A random architecture will not provide support for the structure of functional tissue, and without the microvascular/capillary system, the tissue cells will not survive.

One last note: with this vascular corrosion casting method, we probably already have, in places, incomplete structures that we want to complete. Ultimately, we don't want to lose structure but to gain it.

My idea was that once we have a fully complete model, we would reduce its size by converting from triangles to NURBS surfaces.

markcmiller86 commented 2 years ago

My idea was that once we have a fully complete model, we would reduce its size by converting from triangles to NURBS surfaces.

So, you are right. We NEED to do that. I think we just need to arrange something like 32 nodes on quartz or pascal and do similar operations there with the full-res dataset. Part of having this smaller dataset is to understand (without the pain of waiting on large resource allocations) what works and doesn't as far as viewing and processing.

There are no interactive banks with 32 nodes, so we'll need to submit to batch banks for this, and it isn't always clear when the submission will launch. So, we may need to codify the operations here into a short python script we can submit for an overnight run.
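
To make that concrete, below is a rough sketch of what such a batch script might look like, assuming we first save the interactive setup as a session file. The file names are placeholders; inside the batch allocation it would be launched with something like visit -np 144 -nowin -cli -s process_bigtoe.py.

# Rough sketch of a batch-mode VisIt script (placeholder file names).
# Launch with something like:  visit -np 144 -nowin -cli -s process_bigtoe.py
import sys

# Re-create the plots/operators we set up interactively from a saved session file
RestoreSession("bigtoe_vessels.session", 0)   # 0 = use the path as given, not ~/.visit
DrawPlots()

# Save an image so we can sanity-check the overnight run in the morning
s = SaveWindowAttributes()
s.fileName = "bigtoe_vessels"
s.format = s.PNG
s.width = 2048
s.height = 2048
SetSaveWindowAttributes(s)
SaveWindow()

# Export the processed surface for downstream tools
e = ExportDBAttributes()
e.db_type = "STL"
e.filename = "bigtoe_vessels_fullres"
ExportDatabase(e)

sys.exit(0)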

markcmiller86 commented 2 years ago

Ok, so @corvette20 and/or @Magicat-0...would either of you be interested in doing the work to produce a python script that would perform steps similar to those above, in BATCH mode, so that you can submit it and get the results back the next morning?

Hint

You could easily use either VisIt's session file feature or its macro recording feature to make the actual python scripting part very simple.

markcmiller86 commented 2 years ago

My idea was that once we have a fully complete model, we would reduce its size by converting from triangles to NURBS surfaces.

I am of the mind that it's logistically and numerically more difficult to achieve quality data reduction from an extracted surface than it is from a properly down-sampled volumetric dataset. So, converting the high-res volumetric data to a triangle surface first, before trying to reduce that in some way, wouldn't be my first choice.

That said, I have little recent experience with that to back up that perspective.

corvette20 commented 2 years ago

@markcmiller86, I would definitely like to take a crack at python code!

Magicat-0 commented 2 years ago

Ok, so @corvette20 and/or @Magicat-0...would either of you be interested in doing the work to produce a python script that would perform steps similar to those above, in BATCH mode, so that you can submit it and get the results back the next morning?

Hint

You could easily use either VisIt's session file feature or its macro recording feature to make the actual python scripting part very simple.

I would love to work on that, but I am not sure if I can have it done by tomorrow morning. I will do my best!

Magicat-0 commented 2 years ago

I am confused, so correct me if I am wrong, but our main project points and goals were:

"Our goal is to use the data and some of our own creativity to define what we want in a high fidelity animation showing blood flowing through a foot model of blood vessels.

I. We want to create some animations as well as understand the basic processes to integrate VisIt with popular animation tools and platforms via the USD file format. We want to include photorealistic representation in the animation and combine the results of two steps. We are going to use VisIt combined with a popular animation tool or platform such as Omniverse, Maya, or Blender, using the USD file format. We don't know how these parts are going to come together; after the first week or two we will have a clearer understanding of it.

II. If the size of the 3D model (USD files) becomes an issue, another area we will investigate is various data-reduction approaches, to explore questions about integrating VisIt with downstream processing animation tools. In this case the data isn't so important in applying these other tools ..." etc

Are we maybe not using the best choice of words to describe some of that? Because it sounds like the main goal has been changed from the above to creating a live and functional microvascular structure to support the tissue architecture of a limb.

wmondy commented 2 years ago

The purpose of using the HPC resources has always been to achieve visualization of a large data set that represents authentic 3-dimensional microvascular architecture, which includes the complete capillary bed system, and to achieve this goal on a relatively large volume of vascular tissue, in this case the vascular tissues found in a human foot. The capillary bed, the critical part of this structure, represents the functionality of the blood vascular system. Our main goal this summer has always been to model this functional architecture in a human foot using the CT and micro-CT image data I provided, captured from a vascular corrosion cast of a human foot.

We started with the big toe, and if time permits, we have four more toes, the sole and the larger vessels of the rest of the foot and lower leg.

The other goal, animation of this microvascular system, is undoubtedly one of the goals for the summer project. Still, animation of a large volume of vascular tissue is not a novel achievement when it is done on a low-resolution, incomplete model that doesn't include its capillary bed. Animation of an incomplete capillary bed system is not our goal unless we discover that it is impossible to achieve the 3-D reconstruction of the complete volume using VisIt and the HPC exascale resources available. With the results we have obtained, I am confident we can meet our goals.

Magicat-0 commented 2 years ago

@wmondy Yes, I agree; one of our goals has been to visualize and process the micro-CT dataset from the corrosion cast, along with the rest of the project plan points outlined above. That is why I was confused when you mentioned our main goal was creating a live and functional microvascular structure to support the tissue architecture of a foot/leg.

Thank you for the clarification.

markcmiller86 commented 1 year ago

BTW, I found an easier way to create a .imgvol file.

Be sure you are running a bash shell. The easiest way is to just run the sh command...that will drop you into a new shell.

First, cd into the folder containing the .tif files you want to assemble into an .imgvol file.

Then, drop into a bash shell (sh) and from there type in the following command (it's actually a very tiny shell script)

for f in *.tif; do echo $(pwd)/$f; done > ~/file.imgvol

You might need to adjust the *.tif filter to ensure you get only the files in the folder that are the slices.

The ~ in the output file name says to write that file to your home directory.
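
For example, the resulting ~/file.imgvol is just one absolute path per slice, something like this (illustrative file names):

/p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample/BigToe_IR_rec0000_smaller.tif
/p/lustre1/miller86/ctdata/MicroCT_Datasets/22Mondy_BigToe_10um_4x4x4_downsample/BigToe_IR_rec0001_smaller.tif
...one line per .tif slice in the folder...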

Inspect the file visually to ensure it appears correct.