adler-j / learned_primal_dual

Learned Primal-Dual Reconstruction
https://arxiv.org/abs/1707.06474
generate_data() is called every 10th iteration and - single entry validation set #2

Open alisiahkoohi opened 6 years ago

alisiahkoohi commented 6 years ago

Why do you generate new training data pairs only every 10th iteration? For instance here.

Also, this line suggests that the validation error is only being evaluated over a single data pair. So technically your validation set contains a single data pair.

adler-j commented 6 years ago

Both observations are true.

I did try generating data in every iteration, but it made little to no difference and came at some performance cost. Now that you mention it, I'm not sure if this made it into the article. Regardless, you can try generating at each iteration and see what happens.
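For readers following along, the regeneration schedule being discussed can be sketched as below. This is only an illustration of the idea, not the repository's training loop: `generate_data` here is a hypothetical stand-in that returns random numbers, and the `i % regenerate_every == 0` condition is an assumption about how "every 10th iteration" is implemented.

```python
import random


def generate_data(rng, batch_size=5):
    """Hypothetical stand-in for the repo's generate_data(): returns (data, truth) pairs."""
    return [(rng.random(), rng.random()) for _ in range(batch_size)]


def run_training(n_iterations, regenerate_every, seed=0):
    """Count how often fresh training data is drawn under a given schedule."""
    rng = random.Random(seed)
    n_generated = 0
    batch = None
    for i in range(n_iterations):
        # Assumed condition: draw a new batch on iterations 0, k, 2k, ...
        if i % regenerate_every == 0:
            batch = generate_data(rng)
            n_generated += 1
        # A real loop would run one optimizer step on `batch` here.
    return n_generated


print(run_training(100, regenerate_every=10))  # fresh data on 10 of 100 iterations
print(run_training(100, regenerate_every=1))   # fresh data on every iteration
```

With `regenerate_every=10`, each batch is reused for ten consecutive steps, which trades data variety for less time spent generating data.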

Regarding the second point, the numerical values were indeed only evaluated for the image displayed in the article (the Shepp-Logan phantom). Notably, this is not a "random sample" from the prior but a rather special case. I picked it to show that the method can generalize quite well. This should be covered in the article.
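If one wanted to evaluate over more than a single pair, a minimal sketch could look like the following. It is not taken from the repository: `psnr` and `mean_psnr` are illustrative names, images are represented as flat lists of floats, and the `data_range` of 1.0 is an assumption about the intensity scale.

```python
import math


def psnr(x_true, x_rec, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat images."""
    mse = sum((a - b) ** 2 for a, b in zip(x_true, x_rec)) / len(x_true)
    if mse == 0:
        return math.inf
    return 10 * math.log10(data_range ** 2 / mse)


def mean_psnr(pairs, data_range=1.0):
    """Average the per-image PSNR over a list of (truth, reconstruction) pairs."""
    values = [psnr(truth, rec, data_range) for truth, rec in pairs]
    return sum(values) / len(values)


# Toy validation set with two pairs instead of one.
validation_pairs = [
    ([0.0, 0.5, 1.0], [0.1, 0.5, 0.9]),
    ([0.2, 0.4, 0.6], [0.2, 0.5, 0.6]),
]
print(mean_psnr(validation_pairs))
```

Averaging per-image PSNR values is one common convention; computing a single PSNR from the pooled mean squared error would give a different number.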

alisiahkoohi commented 6 years ago

I ran learned_primal_dual.py for ellipses with generate_data() being called every iteration. I double-checked that I hadn't changed anything else, but the training loss blew up after ~5k steps. I will try the original code.

screenshot from 2018-07-18 11-33-02

adler-j commented 6 years ago

Very interesting. It has been a long time since I ran this code. Does it work well if you call it every 10th iteration?

adler-j commented 6 years ago

Did you ever get around to running the original code? Did it work?

alisiahkoohi commented 6 years ago

Sorry for the late response. I should mention that I was trying to run the code on CPU so maybe that is why I see this issue. If that is the case it may also explain why the original code didn't work for me.

I haven't had a chance to install the ODL extensions to run it on GPU yet.

ChengV0 commented 5 years ago

Did you ever get around to running the original code? Did it work?

Running the source code does not reproduce the results in the article. What is the cause? I hope you can give me some advice.

ChengV0 commented 5 years ago

Did you ever get around to running the original code? Did it work?

image

Sorry, I had some problems uploading pictures. This is my result. I look forward to your reply. Thank you.

adler-j commented 5 years ago

Which file are you running to get those results? I've tried the current master on my machine and I get reasonable results. I also need to know e.g. how many iterations you ran.

Also, what version of ODL are you using?

ChengV0 commented 5 years ago

Which file are you running to get those results? I've tried the current master on my machine and I get reasonable results. I also need to know e.g. how many iterations you ran.

Thank you for your reply. I am running learned_primal_dual.py and learned_chambolle_pock.py. The screenshot is the result of running the latter, and I stopped training because it collapsed at about 3000 steps. I'm sorry I didn't save the screenshots of the loss and PSNR.

adler-j commented 5 years ago

Sadly, a TensorFlow bug that hasn't been fixed in half a year (https://github.com/tensorflow/tensorflow/issues/16864) is causing this code to run extremely slowly on my machine, so it's hard for me to debug.

Are you running on master?

What version of ODL are you using?

ChengV0 commented 5 years ago

image

This is my result for learned_primal_dual.py. Judging from the chart, the result does not meet expectations. Is this due to inadequate training?

ChengV0 commented 5 years ago

Sadly a tensorflow bug that hasn't been fixed in half a year (tensorflow/tensorflow#16864) is causing this code to run extremely slowly on my machine, so it's hard for me to debug.

Are you running on master?

What version of ODL are you using?

Yes, odl-1.0.0.dev0

adler-j commented 5 years ago

Are you getting the same problem with learned_primal_dual.py? For example, can you show a loss curve?

ChengV0 commented 5 years ago

Are you getting the same problem with learned_primal_dual.py? E.g. can you show a loss curve

image

I think so. The results show no tendency to improve.

adler-j commented 5 years ago

What implementation of the ray transform are you using?

E.g.

>>> operator.impl
'astra_cuda'

ChengV0 commented 5 years ago

image

I am just using the original code.

adler-j commented 5 years ago

I would very strongly recommend that you install ASTRA. Try:

conda install -c astra-toolbox astra-toolbox

adler-j commented 5 years ago

My learning curves look like this:

image

I.e., it seems to be improving just fine.

The above is for "learned_chambolle_pock.py"

ChengV0 commented 5 years ago

I would very strongly recommend you install astra, try

conda install -c astra-toolbox astra-toolbox

image

Our results are different.

ChengV0 commented 5 years ago

My learning curves look like this:

image

E.g. seems to be improving just fine.

The above is for "learned_chambolle_pock.py"

Maybe you can send me the code you're running now so that I can have a look. I ran into problems installing astra-toolbox on Linux.

adler-j commented 5 years ago

What problem did you encounter? I'm literally just running the code in this repo.

ChengV0 commented 5 years ago

What problem did you encounter? I'm literally just running the code in this repo.

It frustrates me that I can't reproduce your results using source code.

ChengV0 commented 5 years ago

My learning curves look like this:

image

E.g. seems to be improving just fine.

The above is for "learned_chambolle_pock.py"

Why are there two curves on each graph, but mine has only one?

adler-j commented 5 years ago

training and testing losses (blue is train, orange is test)

ChengV0 commented 5 years ago

training and testing losses (blue is train, orange is test)

Oh, so the code you use is different from mine?

adler-j commented 5 years ago

The "learned_chambolle_pock.py" file includes both training and testing losses. See e.g.

https://github.com/adler-j/learned_primal_dual/blob/64901e8b585b23fcadae1b007e97fcc06b091f5e/ellipses/learned_chambolle_pock.py#L152-L154

ChengV0 commented 5 years ago

training and testing losses (blue is train, orange is test)

image image

I got terrible results and I don't know why (primal_dual.py).

ChengV0 commented 5 years ago

The "learned_chambolle_pock" file includes both training and testing losses. See e.g.

learned_primal_dual/ellipses/learned_chambolle_pock.py

Lines 152 to 154 in 64901e8

test_summary_writer = tf.summary.FileWriter(adler.tensorflow.util.default_tensorboard_dir(name) + '/test', sess.graph)
train_summary_writer = tf.summary.FileWriter(adler.tensorflow.util.default_tensorboard_dir(name) + '/train')

Well, thank you

adler-j commented 5 years ago

It's very hard for me to debug remotely, but my best guess right now is that you need astra.

Apart from that, make sure that you have downloaded the latest version of this repo and ODL (e.g. re-install them).

Finally, what TF version do you run?

ChengV0 commented 5 years ago

Ok, thank you very much for your patient answer. I will try again. TF:(1.8.0)

adler-j commented 5 years ago

It's great to get feedback. I try to make sure the code is runnable by everyone. Please report any progress.

ChengV0 commented 5 years ago

It's great to get feedback. I try to make sure the code is runnable by everyone. please report any progress.

image image

The result is very bad (primal_dual.py). If it's convenient for you, I'd like to see your training summaries/images. Thank you.

adler-j commented 5 years ago

Did you re-install this library and ODL as advised? Did you install ASTRA? If you do not follow my advice it's hard to help.

My GPU is currently quite busy, but my training curve is a rather smooth convergence towards ~37 PSNR, nothing like what you are seeing.

ChengV0 commented 5 years ago

Did you re-install this library and ODL as advised? Did you install ASTRA? If you do not follow my advice it's hard to help.

My GPU is currently quite busy, but my training curve is a rather smooth convergence towards ~37 PSNR, nothing like what you are seeing.

image

Thank you for your reply. I had difficulties installing astra_toolbox because the server does not have conda installed and uses pip3 instead. May I have a look at your training images (summaries/images)? Also, during training, is your GPU-Util similar to mine? What is the reason for this? Thank you again.

adler-j commented 5 years ago

My curves look like this:

image

image

image

My GPU is way more busy than that, i.e. not 100% but far higher than 1%. I guess mine is busy because I have installed ASTRA; without it, all the time is spent in scikit-image.
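A quick way to see which of the two backends mentioned in this thread can be imported at all is sketched below. It only checks whether the Python packages are installed, using the standard library; `available_backends` is an illustrative helper, not part of ODL or this repository, and it does not tell you whether CUDA support works or which backend ODL actually selects. Checking operator.impl as shown earlier in the thread is still the authoritative test.

```python
import importlib.util


def available_backends(candidates=("astra", "skimage")):
    """Map each candidate module name to True if Python can locate it for import."""
    return {name: importlib.util.find_spec(name) is not None for name in candidates}


# Prints a dict such as {'astra': ..., 'skimage': ...} depending on the environment.
print(available_backends())
```

If `astra` shows up as missing here, the slow scikit-image fallback described above is a likely explanation for very low GPU utilization.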

ChengV0 commented 5 years ago

My curves look like this:

image

image

image

My GPU is way more busy than that, e.g. not 100% but far higher than 1%. I guess mine is busy because i have installed ASTRA, without it all the time is spent in scikit-image.

Thank you very much. You have helped me a lot.

AceCoooool commented 5 years ago

I would very strongly recommend you install astra, try

conda install -c astra-toolbox astra-toolbox

I also get curves similar to @ChengV0's when running ellipses/learned_primal.py and ellipses/learned_primal_dual.py.

I thought astra-toolbox was simply faster than skimage and would not influence the results much. (Addition: oh no, it does influence the results! However, I do not know why. Using astra reproduces the author @adler-j's learning curve, while using skimage gives the learning curve that ChengV0 got.) For example:

import astra
import numpy as np
from skimage import measure
import scipy.io

P = scipy.io.loadmat('phantom.mat')['phantom256']

# astra
vol_geom = astra.create_vol_geom(256, 256)
proj_geom = astra.create_proj_geom('parallel', 1.0, 384, np.linspace(0, np.pi, 180, False))

proj_id = astra.create_projector('cuda', proj_geom, vol_geom)
sinogram_id, sinogram = astra.create_sino(P, proj_id)
rec_id = astra.data2d.create('-vol', vol_geom)
cfg = astra.astra_dict('FBP_CUDA')
cfg['ReconstructionDataId'] = rec_id
cfg['ProjectionDataId'] = sinogram_id
cfg['option'] = {'FilterType': 'Ram-Lak'}
alg_id = astra.algorithm.create(cfg)
astra.algorithm.run(alg_id)
rec = astra.data2d.get(rec_id)

print("psnr: ", measure.compare_psnr(P.astype(np.float32), rec))

astra.algorithm.delete(alg_id)
astra.data2d.delete(rec_id)
astra.data2d.delete(sinogram_id)
astra.projector.delete(proj_id)

# skimage
from skimage.transform import radon, iradon

img = P.astype(np.float64)
theta = np.linspace(0., 180., 180, endpoint=False)
sinogram = radon(img, theta=theta, circle=True)
reconstruction_fbp = iradon(sinogram, theta=theta, circle=True)

print("psnr: ", measure.compare_psnr(img, reconstruction_fbp))

psnr of astra: 34.125
psnr of skimage: 34.116

AceCoooool commented 5 years ago

@adler-j I have some questions (I am not opening a new issue, for convenience):

  1. The default geometry (odl.tomo.parallel_beam_geometry) uses a detector cell size (geometry.det_partition.cell_sides) that is not equal to one. Is this better? I ask because many implementations (e.g. the skimage radon function) use 1.
  2. The derivative (adjoint) of the projection is the back projection. However, many libraries (e.g. astra-toolbox and skimage radon) do not implement the two as an exactly matched pair.

Thank you @adler-j !
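The second question can be checked numerically with the standard dot-product test: for a matched pair, the inner product of Ax with y equals the inner product of x with the back projection of y. The sketch below uses a small random matrix as a toy stand-in for the projector, so it says nothing about how well astra-toolbox or skimage are actually matched; `adjoint_mismatch` and the 1.1 scaling are illustrative assumptions.

```python
import random


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def adjoint_mismatch(forward, backward, n_in, n_out, rng):
    """Relative gap between <forward(x), y> and <x, backward(y)> for random x, y."""
    x = [rng.gauss(0, 1) for _ in range(n_in)]
    y = [rng.gauss(0, 1) for _ in range(n_out)]
    lhs = dot(forward(x), y)
    rhs = dot(x, backward(y))
    return abs(lhs - rhs) / max(abs(lhs), abs(rhs), 1e-12)


rng = random.Random(0)
A = [[rng.random() for _ in range(4)] for _ in range(6)]  # toy 6x4 "projector"
At = [list(col) for col in zip(*A)]                        # its exact transpose

# Matched pair: the back projector is the exact transpose.
matched = adjoint_mismatch(lambda x: matvec(A, x), lambda y: matvec(At, y), 4, 6, rng)

# Unmatched pair: the back projector is off by a constant factor (assumed 1.1).
unmatched = adjoint_mismatch(
    lambda x: matvec(A, x), lambda y: [1.1 * v for v in matvec(At, y)], 4, 6, rng
)

print(matched, unmatched)
```

A mismatch near machine precision indicates an exact adjoint, while a clearly nonzero value means the gradient used in training is only approximate.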