Closed abureau closed 3 years ago
Hi, this error is within my code rather than anything having to do with Tensorflow. Did you tweak the parameters JSON file at all? I'm unable to reproduce the error. Also, I recommend starting with the parameters file here https://github.com/calico/basenji/blob/master/manuscripts/akita/params.json rather than the modified tutorial version.
I started from params.json and changed the number of units to 1 because I have only 1 dataset. I also turned off batch norm because I was getting another error. Here is what I did: params_file = "/home/bureau/projects/def-bureau/basenji/manuscripts/akita/params.json" with open(params_file) as params_open: params = json.load(params_open) params_model = params['model'] params_model['batch_norm'] = False params_model['head_hic'][-1]['units'] =1 params['model'] = params_model with open('/home/bureau/projects/def-bureau/bureau/distiller/iPSC/data/1m/params_tutorial.json','w') as params_tutorial_file: json.dump(params,params_tutorial_file)
Hi, I'm still unable to reproduce the error. The code is having trouble parsing one of the computation blocks defined in the JSON file. Could you add a pdb.set_trace() here https://github.com/calico/basenji/blob/master/basenji/seqnn.py#L52 and poke around? Specifically, check what block_params contains.
OK, I will try that.
Can you tell me which version of Tensor Flow you are using? I have solved other similar errors in basenji by changing Tensor Flow version.
Sure, I'm using the latest 2.4
I installed TensorFlow 2.4 and I don't get this error anymore (I get another one I will describe in another issue). I am not sure the version change solved the issue, because I don't get it anymore with TensorFlow 2.3 either. Maybe I updated something else I did not realize? Anyway, issue closed.
Well, switching from the tf2_hic to the master branch brought back this problem. Here is what what block_params contains when it stops: (Pdb) block_params {'filters': 96, 'kernel_size': 11, 'pool_size': 2} (Pdb) block_params['name'] *** KeyError: 'name'
That seems to be describing the first convolution block. The 'name' key is specified here: https://github.com/calico/basenji/blob/master/manuscripts/akita/params.json#L23. Can you verify that your parameters file has the 'name' key present?
Indeed, my modifications to the parameter file removed the 'name' keys. I thought I modified the parameter file the same way as in the Akita training tutorial, but that was not the case. When I changed the parameter file exactly as in the tutorial, the 'name' key remains, and this error no longer happens. Thanks for pointing me to the problem.
I found the reason I get this error (that's why I reopen this issue under a different name): SeqNN deletes parameter block names (line 55). So when I call
seqnn_model = seqnn.SeqNN(params_model)
as in cell 2 of https://nbviewer.jupyter.org/github/gfudenberg/basenji/blob/tf2_hic/manuscripts/akita/explore_model.ipynb, then changes some parameters in params_model, and call
seqnn_model = seqnn.SeqNN(params_model)
again (or save the parameters in a file and call akita_train.py as in the first comment on this issue), I get this error because block names have been deleted.
Why does SeqNN need to delete the block names? Could it not delete them to avoid this annoyance?
SeqNN doesn't need to delete the block names. I pushed a commit to master that avoids doing it. Give that a try
Yes, your commit prevents the deletion of block names in params_model. Thank you.
I am curious however why you still want to delete the name from block_args. Let say the name is 'final' and you don't delete it, then I noticed that
block_func = blocks.keras_func['Dense']
current = block_func(**block_args)(current)
returns a function with name 'final' which makes sense to me. I did notice that the final layer is actually created by the alternative call
block_func = blocks.name_func[block_name]
current = block_func(current, **block_args)
which calls the function final in blocks.py, but returns a layer named "dense". That's beyond this issue, it would make more sense to me to name the layer 'final'.
I delete the 'name' argument because the actual blocks don't use it; it's only used for routing to the proper block. I'm not sure that I understand the rest of your question. Where are you seeing these names? Are you talking about Tensorflow name_scope (https://www.tensorflow.org/api_docs/python/tf/name_scope)? That's completely independent of this, and I'm not attempting to set names at that level here.
I was referring to the first column that appears in https://github.com/calico/basenji/blob/master/manuscripts/akita/keras_print.txt. For the final layer you see
dense (Dense) (None, 99681, 5) 245 upper_tri[0][0]
The name "dense" may have to do with name_scope, but I am not sure to understand.
When you don't delete the 'name' argument, calling
block_func = blocks.keras_func['Dense'] current = block_func(**block_args)(current)
displays
final (Dense) (None, 99681, 5) 245 upper_tri[0][0]
If that doesn't matter, then just close this issue.
Yea I don't think this matters. I recently added a "final" block (https://github.com/calico/basenji/blob/master/basenji/blocks.py#L699) to handle common final layer options, but that print output reflects the deprecated "dense" block below it. I'm guessing this is related to that
Well, seqnn.SeqNN still creates a block called "dense" with the new code for the "final" block. Anyway, I understand it is only cosmetic.
I am trying to run Akita and get the following error message. I have TensorFlow version 2.3. I am wondering whether Akita has been tested with that version, and if not, which version do you recommend?
akita_train.py -o /home/bureau/projects/def-bureau/bureau/distiller/iPSC/data/1m/train_out/ /home/bureau/projects/def-bureau/bureau/distiller/iPSC/data/1m/params_tutorial.json /home/bureau/projects/def-bureau/bureau/distiller/iPSC/data/1m/
/home/bureau/projects/def-bureau/bureau/distiller/iPSC/data/1m//tfrecords/train-.tfr has 7617 sequences with 1/1 targets /home/bureau/projects/def-bureau/bureau/distiller/iPSC/data/1m//tfrecords/valid-.tfr has 6676 sequences with 1/1 targets Traceback (most recent call last): File "/home/bureau/projects/def-bureau/basenji/bin/akita_train.py", line 151, in
main()
File "/home/bureau/projects/def-bureau/basenji/bin/akita_train.py", line 107, in main
seqnn_model = seqnn.SeqNN(params_model)
File "/lustre03/project/6000443/basenji/basenji/seqnn.py", line 35, in init
self.build_model()
File "/lustre03/project/6000443/basenji/basenji/seqnn.py", line 103, in build_model
current = self.build_block(current, block_params)
File "/lustre03/project/6000443/basenji/basenji/seqnn.py", line 54, in build_block
block_name = block_params['name']
KeyError: 'name'