zaccharieramzi / fastmri-reproducible-benchmark

Try several methods for MRI reconstruction on the fastmri dataset. Home to the XPDNet, runner-up of the 2020 fastMRI challenge.
https://fastmri.org/leaderboards
MIT License
151 stars 50 forks

Error in PDNet running upon compiling #120

Closed GeraldGore closed 3 years ago

GeraldGore commented 3 years ago

Hi,

I came here from the public fastMRI leaderboard, and I don't have much experience with TensorFlow. While running PDNet, I get an attribute error at the check precision_policy.loss_scale is None: AttributeError: 'Policy' object has no attribute 'loss_scale'.

Have you seen this error too? Also, did you reach that challenge result with this code?

zaccharieramzi commented 3 years ago

Hi !

Can you show me the code snippet you used to run the PDNet?

I think this error rings a bell, but I would need to see exactly what you did to be sure. I did indeed reach the challenge result using this repository, but with the XPDNet rather than the PDNet.

zaccharieramzi commented 3 years ago

I guess the error comes from this line, where we try to determine whether we are using mixed precision in order to not use norm clipping (I think I saw at some point it wasn't compatible but can't track down a specific issue).

I currently cannot reproduce the error you mention, even when re-running the PDNet. I suspect a version problem: which version of TensorFlow are you using?

It should work for TensorFlow 2.2 and 2.3.
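The check described above (skip norm clipping when a loss scale is set) can be sketched so that it does not raise when the attribute is missing. This is only an illustration, not the repository's code: build_optimizer_kwargs is a hypothetical helper, and the SimpleNamespace objects are stand-ins for TensorFlow Policy objects so the snippet runs without TensorFlow. Treating a missing loss_scale as "no loss scaling" is an assumption and may be wrong if mixed precision is actually active.

```python
from types import SimpleNamespace


def build_optimizer_kwargs(precision_policy):
    """Return optimizer kwargs, enabling norm clipping only when no loss scale is set."""
    opt_kwargs = {}
    # getattr with a default avoids the AttributeError when the policy
    # object has no loss_scale attribute at all (assumption: a missing
    # attribute is treated the same as loss_scale being None).
    if getattr(precision_policy, 'loss_scale', None) is None:
        opt_kwargs['clipnorm'] = 1.
    return opt_kwargs


# Hypothetical stand-ins for policy objects, used only for illustration.
policy_without_attr = SimpleNamespace(name='float32')
policy_with_scale = SimpleNamespace(name='mixed_float16', loss_scale='dynamic')

print(build_optimizer_kwargs(policy_without_attr))  # {'clipnorm': 1.0}
print(build_optimizer_kwargs(policy_with_scale))    # {}
```

With a real policy, the same helper would be called on the object returned by mixed_precision.global_policy().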

GeraldGore commented 3 years ago

Hello,

Sorry, I saw your message late. Yes, it is a TensorFlow error. But changing the compile function to match TensorFlow 2.2.0 gives other errors about attributes.

My tensorflow version is 2.2.0.

zaccharieramzi commented 3 years ago

I am not sure I understand what you are saying. Can you show me what code you ran to have the error? Also can you paste the entire stack trace?

GeraldGore commented 3 years ago

Hello,

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-0fe99a9645e2> in <module>()
----> 1 get_ipython().run_cell_magic('time', '', '\nfor net_params in all_net_params:\n    save_figure_for_params(**net_params)\n    ')

6 frames
/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2115             magic_arg_s = self.var_expand(line, stack_depth)
   2116             with self.builtin_trap:
-> 2117                 result = fn(magic_arg_s, cell)
   2118             return result
   2119 

<decorator-gen-60> in time(self, line, cell, local_ns)

/usr/local/lib/python3.6/dist-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    186     # but it's overkill for just that one bit of state.
    187     def magic_deco(arg):
--> 188         call = lambda f, *a, **k: f(*a, **k)
    189 
    190         if callable(arg):

/usr/local/lib/python3.6/dist-packages/IPython/core/magics/execution.py in time(self, line, cell, local_ns)
   1191         else:
   1192             st = clock2()
-> 1193             exec(code, glob, local_ns)
   1194             end = clock2()
   1195             out = None

<timed exec> in <module>()

<ipython-input-15-e4cc7e338f3a> in save_figure_for_params(reco_function, test_gen, name, **net_params)
     30 def save_figure_for_params(reco_function=None, test_gen=None, name=None, **net_params):
     31 
---> 32     model = unpack_model(**net_params)
     33     for image_index in range((len(test_gen_scaled))):
     34         im_recos= reco_function(*test_gen[image_index], model)

<ipython-input-15-e4cc7e338f3a> in unpack_model(init_function, run_params, run_id, epoch, **dummy_kwargs)
     20 
     21 def unpack_model(init_function=None, run_params=None, run_id=None, epoch=300, **dummy_kwargs):
---> 22     model = init_function ( **run_params )
     23     chkpt_path =  f'/content/drive/My Drive/fastmri_master/checkpoints/{run_id}-{epoch}.hdf5'
     24     model.load_weights ( chkpt_path )

/content/drive/My Drive/fastmri_master/fastmri_recon/models/functional_models/pdnet.py in pdnet(input_size, n_filters, lr, n_primal, n_dual, n_iter, primal_only, fastmri, activation)
     85         image_res = Lambda(tf.math.abs)(image_res)
     86     model = Model(inputs=[kspace_input, mask], outputs=image_res)
---> 87     default_model_compile(model, lr)
     88 
     89 

/content/drive/My Drive/fastmri_master/fastmri_recon/models/training/compile.py in default_model_compile(model, lr, loss)
     12     opt_kwargs = {}
     13     precision_policy = mixed_precision.global_policy()
---> 14     if precision_policy.loss_scale is None:
     15         opt_kwargs['clipnorm'] = 1.
     16     if loss == 'compound_mssim':

AttributeError: 'Policy' object has no attribute 'loss_scale'

I changed mixed_precision.global_policy to mixed_precision.experimental.Policy() in the compile function, but then I ran into another problem.

Best,

zaccharieramzi commented 3 years ago

Hi,

Here is a Colab notebook showing that, in TensorFlow v2.2, loss_scale is available on the Policy object.

I think you are using a different TensorFlow version. Can you try to verify that using:

import tensorflow as tf

print(tf.__version__)
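The printed version string can also be compared against the versions mentioned earlier in this thread (2.2 and 2.3). The following is a minimal sketch; is_supported_tf_version is a hypothetical helper, not part of the repository or of TensorFlow, and it takes the version as a plain string so it runs without TensorFlow installed.

```python
def is_supported_tf_version(version, supported=((2, 2), (2, 3))):
    """Check whether the major.minor part of a version string is in the supported set."""
    # Only the first two components matter here, e.g. '2.2.0' -> (2, 2).
    major, minor = (int(part) for part in version.split('.')[:2])
    return (major, minor) in supported


print(is_supported_tf_version('2.2.0'))  # True
print(is_supported_tf_version('2.4.1'))  # False
```

In practice it would be called as is_supported_tf_version(tf.__version__).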

Please note that I have edited your comments and the initial issue to use GitHub markdown code formatting. This makes the code easier to read (you can even get syntax highlighting for Python code). You can find some examples of code formatting here.

zaccharieramzi commented 3 years ago

Hi @GeraldGore , do you have any updates on this?

GeraldGore commented 3 years ago

Hello,

Sorry for the late reply. I have two folders containing fastmri-reproducible-benchmark. With the old version, I did not get this error. But with the latest folder, I have hit it many times.

Best,

zaccharieramzi commented 3 years ago

Hi,

I am not sure which folders you are referring to. If you have different versions, maybe try pulling master to get the latest one, for example with git pull origin master.

GeraldGore commented 3 years ago

Hi, it works with the old version. Thank you. I also have a question: can these networks be used for transfer learning? Can I use your checkpoint for this?

Best,

zaccharieramzi commented 3 years ago

Again, I am really not sure what you are referring to when you talk about the "old version". Do you mean the old unet implementation? Or the functional models as opposed to the subclassed models?

In any case I am surprised you got the mixed_precision error. But I am happy that you managed to get over it, and therefore I am closing this issue.

Regarding your next question, can you please open a new issue for this? This way we can keep things tidy.