Closed: hyebinyoo closed this issue 7 months ago
Also, the paper says that "X-rays are generated by perspective projection". Do you have any supporting references for this? It is generally believed that X-rays are generated through orthogonal projection, and I cannot find any source stating that they are generated through perspective projection.
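To make the distinction concrete, here is a minimal sketch (not from the paper) contrasting the two models: orthogonal (parallel-beam) projection simply drops the depth axis, while perspective (cone-beam) projection divides by depth, so points nearer the source are magnified more. All names and numbers below are illustrative assumptions.

```python
import numpy as np

def orthographic_project(p):
    # Parallel rays: the detector coordinate ignores depth entirely.
    return np.array([p[0], p[1]])

def perspective_project(p, d=1000.0):
    # Cone-beam model: X-ray source at the origin, detector plane at z = d.
    # The detector coordinate shrinks as the point moves away from the source.
    x, y, z = p
    return np.array([d * x / z, d * y / z])

p_near = np.array([10.0, 10.0, 500.0])
p_far = np.array([10.0, 10.0, 800.0])

# Orthographic: both points land on the same detector pixel.
print(orthographic_project(p_near), orthographic_project(p_far))
# Perspective: the nearer point is magnified more ([20, 20] vs [12.5, 12.5]).
print(perspective_project(p_near), perspective_project(p_far))
```

Real clinical X-ray systems use a point-like source with diverging rays, which is why cone-beam geometry is modeled as perspective projection in the reconstruction literature.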
x_i represents the voxel coordinates in a 3D CT volume, normalized relative to the volume's resolution, which commonly corresponds to a fixed size in previous studies (e.g., 128x128x128). Thus, it is not additional information such as an extra X-ray or a semantic map. Conceptually, x_i is similar to the positional encoding in Transformer models.
We recommend referring to the following papers on projection:
Shen, L., Zhao, W., & Xing, L. (2019). Patient-specific reconstruction of volumetric computed tomography images from a single projection view via deep learning. Nature Biomedical Engineering, 3(11), 880-888.
Shen, L., Yu, L., Zhao, W., Pauly, J., & Xing, L. (2022). Novel-view X-ray projection synthesis through geometry-integrated deep learning. Medical Image Analysis, 77, 102372.
In this paper, you say "our model reconstructs the 3D CT volume using only two X-rays (PA and lateral view images) without additional information." But PerX2CT does use additional information: the 3D positional information, which I believe is the voxel position.
That x_i is the voxel position, i.e., the 3D coordinates in the target CT volume. Isn't that information still needed at inference time?