visit-dav / summer-projects

A place to manage activity on summer projects

Identify a small proxy dataset to use on local machines #15

Closed markcmiller86 closed 2 years ago

markcmiller86 commented 2 years ago
Magicat-0 commented 2 years ago

I just sent both of you a google drive link with a possible proxy dataset. Let me know what you think (and if the link works😅)

markcmiller86 commented 2 years ago

> I just sent both of you a google drive link with a possible proxy dataset. Let me know what you think (and if the link works😅)

I downloaded it. Where should I be looking? There are several folders. I randomly descended into manifest-1647114568231 - Copy/CPTAC-SAR/C3L-01466/05-08-2009-NA-CT LOWER EXTREMITY WITH CONTRAST LEFT-68226/2.000000-venous-28485 and looked at 1-189.dcm by using convert (https://www.imagemagick.org) to convert it to PNG. It looks like some interesting data.

markcmiller86 commented 2 years ago

FYI...I am seeing a lot of these datasets use the DICOM image format (.dcm). VisIt doesn't currently read .dcm files directly. However, I think we can enhance it to do so fairly easily: we already read several image formats, and we could adjust our image reader to add DICOM to the list.

Magicat-0 commented 2 years ago

The manifest folder is the original CT data. It has all the 2D image slices separated into folders by reconstruction filter/algorithm (optimized to visualize soft tissue, bone, etc.) and other parameters. To actually see the 3D reconstruction of the stacked 2D slices, you would have to use something like Slicer, Invesalius, CTvox, etc. to render the 3D volume and then save it as an .stl so it can be used in most 3D modeling software. The .stl files I sent in the link have already been through the above steps, so you can view them in 3D and modify them.

Yes, DICOM is the standard medical imaging format or something like that. I created the .stl from the stack of DICOM files. The manual says VisIt supports .stl, but I tried and it wouldn't open it. Did I miss something?

markcmiller86 commented 2 years ago

> I created the .stl from the stack of DICOM files.

How big of an stl file was it?

Binary or ascii?
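
One rough way to answer the binary-or-ASCII question without opening the file in a viewer is to peek at the first bytes. The sketch below is only a heuristic, and the function name `looks_like_ascii_stl` and the 1024-byte probe size are made up for illustration. It relies on the usual convention that an ASCII STL starts with the keyword `solid` and contains `facet` records as plain text, while a binary STL has an 80-byte free-form header followed by packed data.

```python
def looks_like_ascii_stl(path, probe_bytes=1024):
    """Heuristic guess: True if the start of the file looks like ASCII STL text.

    Assumption: ASCII STL files begin with 'solid' and list 'facet' records
    near the top. Some binary files also put 'solid' in their header, so the
    'facet' check is what separates the two cases here.
    """
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    return head.lstrip().startswith(b"solid") and b"facet" in head
```

An ASCII file with an empty solid (no facets at all) would be misclassified by this check, so treat the result as a hint rather than a definitive answer.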

markcmiller86 commented 2 years ago

If it's small enough (<25 MB) to attach here, can you attach it? If you need to, adjust the filename to use a GitHub-supported attachment extension (e.g. if you have foo.stl, name it foo.stl.zip, attach that, and just lemme know it's not really zipped).

Magicat-0 commented 2 years ago

It's 139 MB. It should be in the google drive link I sent earlier; it's the one that starts with C3L.

markcmiller86 commented 2 years ago

Ok, this is a binary STL file. Based on information about the format, each triangle is 50 bytes and the number of triangles in the file is given at byte offset 80 as a little-endian int. The output of the od (octal dump) command on Linux, shown below, gives 2917697 as the number of triangles. Indeed, when I multiply that number by 50 and add 80 bytes for the header, I get 145884930 bytes, which is the number of bytes in the file.

od -j 80 -D C3L-01466-2\ -\ Copy.stl | more
0000120           2917697      1015736163      3212664958      1041313178
0000140        1118896517      3276654920      1043550489      1118795479
0000160        3276654953      1049038775      1118795539      3276657429
0000200        3078728901       133365760       958971701      3541352243
0000220        1997749706      3541385903      2898641741      1993848705
 .
 .
 .
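
The same check can be sketched in Python for anyone without od handy. This is a minimal illustration of the arithmetic described above, not a VisIt utility; the function names are hypothetical. One assumption to flag: the commonly documented binary STL layout also counts the 4-byte triangle count itself, which would put a strictly conforming file at 84 + 50 × n bytes, so the helper exposes both totals rather than asserting which one matches a given file.

```python
import struct

HEADER_BYTES = 80     # free-form header preceding the triangle count
COUNT_BYTES = 4       # little-endian unsigned int stored at byte offset 80
TRIANGLE_BYTES = 50   # per-triangle record size quoted in the thread


def stl_triangle_count(path):
    """Read the triangle count stored at byte offset 80 of a binary STL."""
    with open(path, "rb") as f:
        f.seek(HEADER_BYTES)
        (count,) = struct.unpack("<I", f.read(COUNT_BYTES))
    return count


def expected_size(count, include_count_field=True):
    """File size implied by a triangle count.

    include_count_field=False reproduces the 80 + 50 * n arithmetic from the
    comment above; True adds the 4 bytes of the count field (an assumption
    based on the commonly documented layout).
    """
    size = HEADER_BYTES + TRIANGLE_BYTES * count
    if include_count_field:
        size += COUNT_BYTES
    return size
```

For example, `expected_size(2917697, include_count_field=False)` reproduces the 145884930 figure above, and comparing either total against `os.path.getsize(path)` is a quick sanity check that a file really is binary STL.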

markcmiller86 commented 2 years ago

Using the same approach as above, the bigtoe.stl file Prof. Mondy uploaded to our shared google drive has 3304108032 triangles in it, which is 3.3 billion...that's huge! But I also have an idea for how to split it with some simple command line tools.
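
The thread does not say which command line tools were in mind, so the following is a Python stand-in rather than that method: a rough sketch of how a binary STL could be cut into smaller files, assuming the layout of an 80-byte header, a 4-byte little-endian count, and 50 bytes per triangle. The function name `split_binary_stl` and the `out_pattern` default are hypothetical.

```python
import struct


def split_binary_stl(path, max_triangles, out_pattern="part_{:04d}.stl"):
    """Write consecutive pieces of a binary STL, each holding at most
    max_triangles triangles. Returns the list of files written."""
    if max_triangles <= 0:
        raise ValueError("max_triangles must be positive")
    outputs = []
    with open(path, "rb") as src:
        header = src.read(80)                        # reused for every piece
        (total,) = struct.unpack("<I", src.read(4))  # triangle count
        remaining = total
        index = 0
        while remaining > 0:
            n = min(max_triangles, remaining)
            data = src.read(50 * n)                  # n whole triangle records
            if len(data) != 50 * n:
                raise ValueError("file is shorter than its triangle count implies")
            name = out_pattern.format(index)
            with open(name, "wb") as dst:
                dst.write(header)
                dst.write(struct.pack("<I", n))      # count for this piece only
                dst.write(data)
            outputs.append(name)
            remaining -= n
            index += 1
    return outputs
```

Each piece is read into memory in one go, so `max_triangles` should be chosen with available RAM in mind. Because triangles are taken in file order, the pieces are not spatially coherent unless the original file happens to be ordered that way.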

markcmiller86 commented 2 years ago

I am considering this completed with the C3L dataset.