eigenvivek / DiffPose

[CVPR 2024] Intraoperative 2D/3D registration via differentiable X-ray rendering
http://vivekg.dev/DiffPose/
MIT License

How to train a model using real preoperative CT scans and intraoperative X-ray images taken with a C-arm machine? #21

Closed 98zhenyu closed 6 months ago

98zhenyu commented 7 months ago

Dear Vivek, I'm currently experimenting with your work and have some possibly basic questions. I hope you don't mind enlightening me.

  1. If I collect an X-ray image intraoperatively, how do I obtain its corresponding intrinsic and extrinsic parameters? Can this information be obtained from the manufacturer or the device itself?
  2. Is the pose corresponding to this projection expressed in the world coordinate system or the camera coordinate system?
  3. If the CT scan covers the entire spinal region but only a portion of the lumbar vertebrae is captured in the intraoperative X-ray, how would one go about constructing DRR models and training in such a scenario?

My questions might be a bit numerous; I hope you won't find them annoying, and I'd appreciate any guidance you can provide.

eigenvivek commented 7 months ago

Hi @98zhenyu, if your intraoperative X-rays are DICOMs, it's possible to extract the intrinsic and extrinsic parameters. There are a few parameters you need from the metadata (SourceToDetectorDistance, SourceToPatientDistance, PrimaryPositionerAngle, SecondaryPositionerAngle, etc.). From there, you can construct an initial pose estimate (the exact details are a little complicated; I can describe them in more detail if that would be helpful).
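
For concreteness, here's a minimal sketch of pulling those fields with pydicom. This isn't code from this repo; the attribute names below are the standard DICOM keywords, and some vendors omit or relocate these fields, so check your metadata first:

```python
# Rough sketch: read C-arm geometry from an X-ray DICOM with pydicom.
# Attribute names are standard DICOM keywords; availability varies by vendor.
import pydicom

ds = pydicom.dcmread("xray.dcm")

sdd = float(ds.DistanceSourceToDetector)    # source-to-detector distance (mm)
spd = float(ds.DistanceSourceToPatient)     # source-to-patient distance (mm)
alpha = float(ds.PositionerPrimaryAngle)    # primary positioner angle (degrees)
beta = float(ds.PositionerSecondaryAngle)   # secondary positioner angle (degrees)
dx, dy = (float(v) for v in ds.ImagerPixelSpacing)  # detector pixel spacing (mm)
height, width = int(ds.Rows), int(ds.Columns)

# Intrinsics: focal length in pixels, with the principal point assumed to be
# at the detector center (a common default when no calibration is available).
fx, fy = sdd / dx, sdd / dy
cx, cy = width / 2, height / 2
```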

The pose that you get from DICOM does not account for the motion of the patient, so you need to do some iterative optimization to refine the pose. This pose is in the world coordinate system.
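
The refinement itself is just gradient-based optimization of the pose parameters against an image-similarity metric. A schematic sketch is below; `render(rot, xyz)` is a placeholder for a differentiable DRR renderer (e.g., DiffDRR), not an actual API from this repo:

```python
# Schematic pose refinement: start from the DICOM-derived pose and take
# gradient steps on an image-similarity loss. `render` is a placeholder for a
# differentiable DRR renderer; the real call signature depends on the library.
import torch

def refine_pose(render, xray, init_rot, init_xyz, n_iters=250, lr=1e-2):
    rot = init_rot.clone().requires_grad_(True)   # Euler angles, shape (1, 3)
    xyz = init_xyz.clone().requires_grad_(True)   # translation in mm, shape (1, 3)
    optimizer = torch.optim.Adam([rot, xyz], lr=lr)

    for _ in range(n_iters):
        optimizer.zero_grad()
        drr = render(rot, xyz)                    # differentiable rendering
        # Negative normalized cross-correlation as the loss
        pred = (drr - drr.mean()) / (drr.std() + 1e-6)
        target = (xray - xray.mean()) / (xray.std() + 1e-6)
        loss = -(pred * target).mean()
        loss.backward()
        optimizer.step()

    return rot.detach(), xyz.detach()
```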

Even if your CT captures more anatomy than your X-ray, I would start by trying to do the alignment as normal. In my experience, registration usually works even if there is a mismatch between the two modalities. If that doesn't work, you can try cropping the CT to better match the X-ray.
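
If you do end up cropping, it can be as simple as slicing the volume along the axis that extends beyond the X-ray's field of view. The indices below are purely illustrative; pick them from your own scan:

```python
# Toy example of cropping a CT volume to a sub-region before building the DRR.
# Slice indices are illustrative only; choose them so the cropped volume
# roughly covers the anatomy visible in the X-ray.
import numpy as np

volume = np.load("ct_volume.npy")     # (D, H, W) array of HU values
spacing = np.array([1.0, 0.8, 0.8])   # voxel spacing in mm (z, y, x)

z_lo, z_hi = 120, 280                 # axial range covering the lumbar spine
lumbar = volume[z_lo:z_hi]            # voxel spacing is unchanged by cropping

# Cropping shifts the volume's isocenter, so any pose defined relative to the
# full volume's center needs a corresponding translation offset.
offset_mm = z_lo * spacing[0]
```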

98zhenyu commented 7 months ago

Thank you very much for your prompt reply! Could you please describe in detail how to estimate the initial pose using the parameters obtained from DICOM metadata? Is further optimization achieved by treating the computed initial pose as `pred_pose` and setting it as an instance variable of `SparseRegistration`? Finally, how should I use the parameters obtained from DICOM metadata to initialize and train the DRR model of the CT? It seems to involve a lot of coordinate system transformations, and I'm a bit confused.

98zhenyu commented 7 months ago

In training on the DeepFluoro dataset, the `isocenter_pose` is defined as follows:

```python
isocenter_rot = torch.tensor([[torch.pi / 2, 0.0, -torch.pi / 2]])
isocenter_xyz = torch.tensor(self.volume.shape) * self.spacing / 2
isocenter_xyz = isocenter_xyz.unsqueeze(0)
self.isocenter_pose = RigidTransform(
    isocenter_rot, isocenter_xyz, "euler_angles", "ZYX"
)
```

I can understand that `isocenter_pose` is defined in the AP view direction at the volume isocenter. However, I don't understand why Appendix D of the DiffPose paper states: 'DiffDRR initializes the camera at (f/2, 0, 0) pointed towards the negative x-direction.' I'm not sure why `isocenter_pose` corresponds to (f/2, 0, 0) pointed towards the negative x-direction. Am I misunderstanding something?
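
To make my question concrete, I evaluated that rotation numerically with scipy (assuming its intrinsic "ZYX" Euler convention matches the one used by `RigidTransform`, which I'm not certain of):

```python
# Quick check of what the (pi/2, 0, -pi/2) "ZYX" Euler rotation does.
# Assumes scipy's intrinsic ZYX convention matches RigidTransform's; if not,
# this check isn't meaningful.
import numpy as np
from scipy.spatial.transform import Rotation

R = Rotation.from_euler("ZYX", [np.pi / 2, 0.0, -np.pi / 2]).as_matrix()
print(np.round(R, 3))
print(np.round(R @ np.array([0.0, 0.0, 1.0]), 3))  # image of the camera's +z axis
```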

eigenvivek commented 6 months ago

closing with #24