DIAL-RPI / FreehandUSRecon

Source code for DCL-Net, a deep learning model for sensorless freehand 3D ultrasound volume reconstruction.
MIT License

Mi' size 3x3 or 4x4 or 6x1? #1

Open vgonzalezd opened 3 years ago

vgonzalezd commented 3 years ago

Good morning! First of all, thank you for your work! I have a question about how you write Mi' in the demo_pos.txt file. Mi' is a 4x4 matrix, as it is the result of Mi' = M(i+1) * Mi^-1. In the article you say that you decompose Mi' into 6 degrees of freedom, but in the txt file you write 9 values. What representation do you use for the 9 values? Thank you in advance. Cordially, Vanessa.

hengtaoguo commented 3 years ago

Hello Vanessa!

Thanks for trying our code! The 9 values in each row of Demo_pos.txt are the positioning information captured by our ultrasound imaging device:

  1. The first value indicates the port number (0) of our tracking device. The second value shows whether the tracking device is working properly (0: OK, 1: Error). These two values are just indicators and may not be useful in the algorithm.
  2. The following 7 values together indicate the position of the frame at this time point: (a) 3 values are the translation along x, y and z; (b) the remaining 4 values are a quaternion with elements in the order (x, y, z, w). This quaternion can then be converted to a rotation matrix.

The function "params_to_mat44" in our code "tools.py" converts the 9 values explained above into a 4x4 transformation matrix. We get the matrices M(i) and M(i+1) at two consecutive time points and compute their relative transformation Mi' as in the article. Then we decompose Mi' into 6 DOF to represent the relative transformation between the two time points, and use this 6 DOF vector as the label for the pair of frames.
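For readers who want to replicate this conversion outside the repo, here is a minimal sketch (the function names are illustrative, not the actual "tools.py" code; it assumes SciPy's (x, y, z, w) quaternion convention, which matches the order described above):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def params_to_mat44_sketch(row):
    # row is one 9-value line of Demo_pos.txt:
    # [port, status, tx, ty, tz, qx, qy, qz, qw]
    mat = np.eye(4)
    mat[:3, :3] = Rotation.from_quat(row[5:9]).as_matrix()  # quaternion -> rotation
    mat[:3, 3] = row[2:5]                                   # translation along x, y, z
    return mat

def relative_transform(row_i, row_next):
    # Mi' = M(i+1) * Mi^-1, as described in the article
    return params_to_mat44_sketch(row_next) @ np.linalg.inv(params_to_mat44_sketch(row_i))
```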

Hope this can solve your problem, and please let us know if we can provide more help! Thanks!

Best, Hengtao

jellyhazel commented 3 years ago

Hello Hengtao! Thanks for sharing. Excellent job! I have a question about the file "dof_stats.txt", which is used in train_network.py. What does this file represent? How is it generated? Looking forward to your reply!

Many thanks Hazel

hengtaoguo commented 3 years ago

Hello Hazel,

Thanks for connecting with us!

The "dof_stats.txt" contains a 6*2 Numpy array. The 6 values in the first column are the mean values of 6 degrees-of-freedom (dof) of our training dataset, while the second column contains the standard deviation. We compute such statistics to do the standardization of our training set (line 405 in "train_network.py"). During the testing, we also use such statistics to recover the magnitude of the network's predictions (line 1081 & 1026 in "test_network.py").

Hope this answers your concerns, and please let us know if we can provide more help!

Best, Hengtao

jellyhazel commented 3 years ago

Thanks for the response!

When trying to run (line 432 in train_network.py) outputs = outputs.data.cpu().numpy(), I get the following: AttributeError: 'tuple' object has no attribute 'data'

I printed the shapes: inputs shape torch.Size([1, 1, 5, 224, 224]), labels shape torch.Size([1, 6]), outputs[1] shape torch.Size([1, 2048, 1, 7, 7])

Any ideas as to why this might be happening? Thanks in advance.

hengtaoguo commented 3 years ago

If you are using "resnext" as the network_type, I am guessing this is because the model returns a tuple (line 262 in networks/resnext.py) instead of just a torch tensor. You can change "outputs = outputs" to "outputs = outputs[0]" in both lines 691 and 693 (train_network.py) to see if this helps.
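A minimal sketch of that workaround, with a dummy tuple mimicking the shapes reported above (names are illustrative, not the repo's exact lines):

```python
import torch

def to_numpy_prediction(outputs):
    # If the network returns a tuple (e.g., (prediction, feature_map)),
    # keep only the prediction tensor before moving it to NumPy.
    if isinstance(outputs, tuple):
        outputs = outputs[0]
    return outputs.data.cpu().numpy()

dummy = (torch.zeros(1, 6), torch.zeros(1, 2048, 1, 7, 7))
print(to_numpy_prediction(dummy).shape)  # (1, 6)
```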

jellyhazel commented 3 years ago

Thanks, this works well! Many thanks! Hazel

jellyhazel commented 3 years ago

Hello Hengtao! Please forgive me for asking so many questions.

  1. There are three types of input in train_network.py: org_img, diff_img, and optical flow, but there is no specific implementation of optical flow. Do I understand this correctly?
  2. What does diff_img represent?
  3. If I input the original image and the corresponding optical flow together, do you have any suggestions for implementing this in the code?

Thank you in advance!

hengtaoguo commented 3 years ago

Hello Hazel! Great to hear from you!

  1. You are right. Currently, there is no implementation of optical flow in train_network.py. We carried out some experiments with optical flow and found that it has not helped much in our project so far. Optical flow captures in-plane motion well, but it cannot provide much out-of-plane motion information, since the tracking reference is lost during elevational motion.

  2. In train_network.py line 376, you can find the implementation of diff_img. Say the input is 3 consecutive frames: "diff_img" computes the differences "1st - 2nd" and "2nd - 3rd", making the input 2 channels of difference images rather than 3 channels of original frames (see the sketch after this list). Note that this "diff_img" part was not included in our MICCAI paper; it was only for our private experiments. We intended to explore how the pure difference between two frames can help the network recover the inter-frame motion. We are still working on the volume reconstruction project and plan to evaluate it more thoroughly from multiple aspects, probably including this "diff_img" setting. You are welcome to explore this option if interested!

  3. When applying optical flow to US reconstruction, you may refer to this article. My suggestion is to first make sure that the optical flow algorithm you use is correct. There are optical flow implementations available in OpenCV; you can start from there and verify that it works correctly by validating it on toy projects such as object tracking. From there, you can follow the settings of previous articles to find out how it works on your project.
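Below is a minimal sketch of both ideas, assuming the frames are stacked as an (N, H, W) array; the function names are illustrative and not the repo's actual line-376 code, and the OpenCV Farneback call is only one common starting point for dense optical flow (it expects grayscale 8-bit frames):

```python
import numpy as np
import cv2

def diff_images(frames):
    # frames: array of shape (N, H, W) holding N consecutive US frames.
    # Returns N-1 difference images: (1st - 2nd), (2nd - 3rd), ...
    frames = frames.astype(np.float32)
    return frames[:-1] - frames[1:]

def dense_flow(prev_frame, next_frame):
    # Dense Farneback optical flow between two grayscale uint8 frames.
    # Positional args: flow=None, pyr_scale=0.5, levels=3, winsize=15,
    # iterations=3, poly_n=5, poly_sigma=1.2, flags=0.
    return cv2.calcOpticalFlowFarneback(
        prev_frame, next_frame, None, 0.5, 3, 15, 3, 5, 1.2, 0)
```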

Hope you find this helpful, and please let us know if you have any questions or thoughts! Thanks!

Hengtao

jellyhazel commented 3 years ago

Hello Hengtao! Thank you very much for your reply; it is really helpful to me. I am now using your method to reconstruct abdominal US (including the liver) acquired with a fan scan, which is different from the ultrasound scans in your article (I guess). Yes, during my experiments, I found that adding optical flow really did not help much. I am currently looking for ways to improve accuracy and intend to make some improvements to the data processing method. If it works, I look forward to communicating with you. I sincerely appreciate your reply.

hengtaoguo commented 3 years ago

You are very welcome! Looking forward to your progress!

QianqianCai commented 3 years ago

Hi Hengtao,

This is very interesting work! I have a quick question about the proposed algorithm.

Since image resolution can be a major concern for both ultrasound frames and image processing algorithms, I am wondering whether there are any resolution requirements for the input US image patches.

Thanks in advance for your reply!

hengtaoguo commented 3 years ago

Hello Qianqian, thanks for your question!

In practice, clinicians may use different resolutions for ultrasound imaging according to their needs. Currently, DCL-Net is trained on a dataset with mixed resolutions (4, 5, 6, 7.1, 8.1, and 9 cm depth), so I would say there are no resolution requirements so far.

However, we are also considering this problem: whether training on one specific depth/resolution can help improve performance. On the downside, this would make our dataset even smaller, because the entire dataset would be divided into several groups. Nevertheless, we are still working towards an improved DCL-Net.

Hope you find this useful, thanks!

Hengtao

vgonzalezd commented 3 years ago

Hello Hengtao, Thank you for always answering the messages. I would like to know where I can find the DCL-Net implementation. It is an excellent idea and I would like to try it on my data. I have already tried resnet50 and my3dnet. I have looked in the files networks.py, generators.py and mynet.py, but I don't know if perhaps the constructor is called something different. If it is not in those files, I was wondering if you could share it? I tried to reproduce it, but I am not very clear about the size of the filters and other parameters. Thank you for your answer. Cordially, Vanessa

hengtaoguo commented 3 years ago

Hi Vanessa,

The DCL-Net is actually here, which was set as the default network. Thank you for pointing this out; I should clarify this in our code! Essentially, our DCL-Net is based on 3D ResNeXt with an attention module. Hope this is helpful, and let us know if you have further questions!

vgonzalezd commented 3 years ago

Hi Hengtao, Thank you for your answer.

Leong1230 commented 2 years ago

Hi, Hengtao! Thank you for answering the questions. I would like to calculate the mean values and the standard deviation for my own data and train the network. Are these values related to the parameter 'neighbour_slice'? Since the output/label of the network is an average transformation across the neighbouring slices, do I need to recalculate the standardized values if I change 'neighbour_slice'? Thank you!