FYI this paper was submitted to ML Reproducibility Challenge, and the reviews can be viewed here: https://openreview.net/forum?id=4hm5ufX69jo . While the report was not accepted for the challenge, the authors probably wish their report to be considered for an independent evaluation for this journal. I hope the reviewers can take the existing reviews and the updated manuscript into consideration for the review. @rougier, since I was the Program Chair, reviewing this paper is a conflict of interest for me, and I pre-emptively recuse myself for the same!
@koustuvsinha Thanks for the explanation. I think this is the first time we have had such a case and I'm not sure how we should handle it. @khinsen @oliviaguest @benoit-girard Thoughts?
@NishantPrabhu I think we can review your submission as a regular ReScience submission, but you would need to adapt your manuscript. Since you already have some reviews, it would also be good to take that advice into account. But before that, please wait until we make a decision and assign an editor.
@rougier
Greetings!
We intend our submission to be evaluated as a regular ReScience submission. The current submission has already been redesigned according to the template prescribed on the repository. If any other form of adaptation/modification is desired, please let us know and we will get it done as soon as possible.
We have gone through and responded to the reviews given during MLRC 2020, and our submission has been revised to address those suggestions as best as possible. Our submission here contains the revised content.
Please let us know if there's anything else we should attend to. Thanks!
@rougier I'd say we should treat this as a regular ReScience submission, and let the editor who will handle the review decide if and how the prior reviews (which correspond to an earlier version) can be integrated into the reviewing process.
I am happy to take this over as editor.
I also recommend @\nicksexton and @\don-tpanic as reviewers for this (when such time comes).
@oliviaguest Ok. The next question is whether we need one or two reviewers, given that the article has already been reviewed.
I vote for 2, not only because I can get 2 easily, but also because it seems fair.
So @rougier, is 2 OK? Shall we all discuss this?
Sorry for the delay; 2 is fine for me.
@nicksexton and @don-tpanic, would either or (even better) both of you be able to review this for me?
Hello, we are on the case with this. We will keep you all posted. Is there a deadline?
@nicksexton perfect, thank you both! I'll give @don-tpanic a chance to also reply, so I can "assign" you both (GitHub thing that allows you to get notifications, etc). Deadline can be flexible, but if you can do this over the next few weeks, that would be amazing. Let me know your own schedules and we can work around that, but a month or so is totally fine.
Yep, I am on board too. Oh hi Olivia :)
Hi @NishantPrabhu
I ran into this error when running python3 main.py --config 'configs/learned_2d.yaml' --task 'train' inside main/:
Traceback (most recent call last):
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 414, in <module>
trainer = Trainer(args)
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 28, in __init__
self.encoder = networks.Encoder(self.config['encoder']).to(self.device)
AttributeError: module 'models.networks' has no attribute 'Encoder'
I suspect this is something minor, like the attribute was renamed? Could you check this for me and update the code?
Thanks! Ken
Hi @don-tpanic
Thanks for pointing out this issue. As you rightly diagnosed, a class name had been modified. I've fixed it now and it's working fine on my end. Let me know if you face any other issues.
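For reference, since the problem was just a renamed class, a minimal illustration of keeping a backwards-compatible alias looks like the sketch below (the class names are hypothetical, not the repo's actual ones):

```python
import torch.nn as nn

# models/networks.py (illustrative only; the real class in the repo differs)
class SelfAttentionEncoder(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.config = config

# Backwards-compatible alias so entry points that still refer to
# networks.Encoder keep working after the rename.
Encoder = SelfAttentionEncoder
```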
Thanks and with warm regards, Nishant.
Hi again @don-tpanic
If you're planning to train the model for a while, you might want to modify the batch size in the config files to suit your accelerator's capacity. I've set it to 1 for testing the changes I recently made.
Thanks, Nishant.
Thanks @NishantPrabhu for your prompt response!
Could you look into this error for me?
When I ran python3 main.py --config 'configs/resnet.yaml' --task 'train'
Traceback (most recent call last):
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 609, in <module>
trainer = Trainer(args)
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 38, in __init__
self.encoder = networks.Encoder(self.config['encoder']).to(self.device)
File "/home/ken/projects/Self-Attention-and-Convolutions/main/models/networks.py", line 109, in __init__
self.num_blocks = config['num_encoder_blocks']
KeyError: 'num_encoder_blocks'
Thanks, Ken
Hi @don-tpanic
The CLI command to train a ResNet model is slightly different. I understand the confusion, since we hadn't mentioned this detail in the README; I have added it now. Please check the Task options section in the README to learn more.
In particular, to train a ResNet model, the correct command would be:
python3 main.py --config 'configs/resnet.yaml' --task 'resnet_train'
Note that --task is given resnet_train and not train. This is because our trainer definitions for the two are quite different.
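To illustrate the idea (this is just a sketch, not the actual structure of main.py), the task flag essentially selects which trainer class gets instantiated, and each trainer expects its own config keys:

```python
import argparse

class Trainer:
    """Trains the self-attention models (expects keys like 'num_encoder_blocks')."""
    def train(self):
        print("training self-attention model")

class ResNetTrainer:
    """Trains the ResNet baseline, whose config has a different set of keys."""
    def train(self):
        print("training ResNet baseline")

# Each --task value maps to its own trainer, which is why configs/resnet.yaml
# must be paired with --task 'resnet_train' rather than 'train'.
TRAINERS = {"train": Trainer, "resnet_train": ResNetTrainer}

parser = argparse.ArgumentParser()
parser.add_argument("--config")
parser.add_argument("--task", choices=sorted(TRAINERS), required=True)
args = parser.parse_args()

TRAINERS[args.task]().train()
```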
Let me know in case of other issues. Thanks!
Nishant.
Hi @NishantPrabhu,
Thanks for updating the README. I was able to run training with resnet perfectly!
Here is another error I came across when training the quadratic embedding with an isotropic Gaussian:
python3 main.py --config 'configs/gaussian.yaml' --task 'train'
The error was thrown at the end of validation:
[TRAIN] Epoch [ 1/300]
Progress: [===================>] 100.00% [Loss] 1.8334 [Accuracy] 0.3218
[VALID] Epoch [ 1/300]
Progress: [===================>] 100.00% [Loss] 1.5961 [Accuracy] 0.4208
Traceback (most recent call last):
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 610, in <module>
trainer.train()
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 438, in train
self.visualize_attention(epoch)
File "/home/ken/projects/Self-Attention-and-Convolutions/main/main.py", line 197, in visualize_attention
ax.imshow(img_normal.permute(1, 2, 0).cpu().numpy())
RuntimeError: number of dims don't match in permute
I believe the error originated in visualize_attention. One related question, if you could clarify it for me: it seems attention will always be visualized regardless of whether the task is train or viz? I wonder if this is intended behaviour or could be slightly improved?
Thanks, Ken
Hi @don-tpanic
First of all, I'm really sorry about all these roadblocks you're facing. It looks like we didn't handle some of our code correctly after modifying it following our RC2020 reviews.
Anyway, I have updated the code to improve consistency across experiments. When you set --task to viz, the code will now look for imgs/ in the working directory and use one image from it for visualization. A folder with that name is already present in the experiment directories, with some handpicked images in it. Please note that for SAN and ViT, loading a pretrained model using --load is necessary for visualization to work. For main, it will work but will generate useless results.
Also, I've disabled visualization during training. We had done that to observe the evolution of attention patterns with training, which should not be necessary now. If needed in the future, just uncomment line 433 in main/main.py. This part of the code is now working fine on my end for all three experiments.
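As a side note, for anyone visualizing their own images: the permute error above typically means the image tensor is not in the expected 3-dimensional (C, H, W) layout. A small illustrative guard (not the actual code in main/main.py) would be:

```python
import torch

def to_displayable(img):
    """Turn a (C, H, W) or (1, C, H, W) tensor into an (H, W, C) array for imshow."""
    if img.dim() == 4:  # drop a stray leading batch dimension
        img = img.squeeze(0)
    if img.dim() != 3:
        raise ValueError(f"expected a 3D image tensor, got shape {tuple(img.shape)}")
    return img.permute(1, 2, 0).cpu().numpy()

# Example: a 3-channel 32x32 image that still carries a batch dimension.
print(to_displayable(torch.rand(1, 3, 32, 32)).shape)  # (32, 32, 3)
```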
Thanks and with warm regards, Nishant.
Greetings!
I hope you're doing well. I just wanted to ask whether you require any further clarification or assistance regarding our submission. It would also be great to have a brief update on how far the review process has come.
Thanks and with warm regards, Nishant.
Hi @nicksexton and @don-tpanic, have either of you (or both) had a chance to keep going on this? Any updates welcome, and a rough (and flexible) ETA would be useful!
Hi @NishantPrabhu @oliviaguest, sorry for being a bit slow getting back to this; we got held up trying to get a paper out. We have reproduced the results of 4 models so far and expect more to come. @nicksexton and I will touch base on this tomorrow and we will update you all with a timeline.
Thanks, Ken
@don-tpanic amazing!
Thanks for your patience, we have had a chance to confer on this. We have worked through running the models in Table 1 and, generally, everything checks out. A few points from our review:
- Some of the experiments (those originally run on AWS V100 GPUs) are computationally heavy and beyond the hardware we have in the lab, so we have not been able to re-run them ourselves; we would appreciate guidance on how to handle this.
- Some runs require configuration choices that are not documented (for example, in learned_2d.yaml, should use_attention_data be set?). We have been able to make educated guesses about these, but for reproducibility this needs to be explicitly stated (ideally in a script, or in the README).
- Reproducing the plots currently relies on a notebook (./temp/log_extract.ipynb). Not that there is anything wrong with notebooks per se, but we are specifically concerned about the need to manually configure model runs and name datasets (the configuration point above), where specific filenames are hard-coded in the plotting notebook. As a minimum, we think there is a need for documented code that automates this part of the process.
Thanks! Nick
Thanks for all this, you @nicksexton and @don-tpanic! โบ๏ธ
Two main points/takeaways from me:
a. @NishantPrabhu, you now probably have a lot of small and large changes to make to improve this replication. Feel free to deal with these while everybody else discusses b (below), if you want.
b. I am not sure about the questions above regarding the computationally heavy experiments the reviewers could not re-run. @khinsen @benoit-girard @rougier can you chime in too, please? I would like the author(s; @NishantPrabhu) to also give their perspective on this, when possible.
Hi @nicksexton Thank you for reviewing our reproducibility report.
Thank you for validating and confirming our results in Table 1. As suggested, we will include separate configuration files for each run, which should help automate reproducing all the results. On the computationally heavy experiments: indeed, some of them are quite demanding, which makes it difficult to validate all our results. If it helps, we could provide our training logs from wandb as a check on our claimed results, which could be an alternative to having to run everything on your end.
PS: As mentioned in our report, we go beyond reproducibility and propose a new variant of the attention operation (to possibly overcome some of the problems observed during the reproduction). Since the discussion so far hasn't mentioned it, we wanted to highlight it to the reviewers and would love to hear their thoughts on it.
Thanks and Regards, Mukund and Nishant
Greetings!
I have added experiment-specific configuration files to our repo and updated the README with more information about them. Please note that I have only included configurations for experiments we performed, even though more combinations are possible. Kindly let me know if anything else needs to be done.
Thanks and with warm regards, Nishant.
@oliviaguest Jumping in. Can you give us a bit of context?
@rougier from above "To summarize, the learned embedding-based experiments were run on 16GB NVIDIA V100 GPUs rented from AWS, and the others were run on an 8GB NVIDIA RTX 2060 owned by us." (As said by @MukundVarmaT.)
I assume this means the results can only be recreated using specific (types of) hardware. What do we think: can this be overlooked, or do the authors have to do something to make it "more" replicable? I assume not, but I am also not sure what this means for the rules/standards of the journal?
cc: @khinsen @benoit-girard
Oh I see. I think we talked about such a case a long time ago without finding a definitive solution. For the present case, I think we need to trust the author of the replication (@NishantPrabhu) that his results are correct, and the reviewers can only review the article and read the code to check that it is consistent. But in the long term, we need to address the problem in a generic way. Not sure how, though.
If so, then all that needs to be done here is for @nicksexton and @don-tpanic to sign off on the changes once the authors confirm they are done making them. OK, good.
Hi @NishantPrabhu,
I have managed to reproduce more results from the paper and things are looking really good so far. Here are some minor suggestions which I think could improve the ease of reproducibility for others in the future.
For the moment, in order to visualise attention over images, one has to trace back into the source code to find where to supply the image file name. It would be nice if this were less hard-coded.
Again a minor suggestion, but I think the command provided for visualising attention could be improved. The current command can cause two problems when no --output is provided by the user: i) since it defaults to a directory named after the current timestamp, it can lead to confusion because model checkpoints and associated results end up in different directories (it took me a while to find the nice attention plots); ii) when using this command for visualising Gaussians in Figure 2 (with --task 'gauss_viz'), leaving --output at its default is actually a problem, because the code looks into this new timestamp-based directory, which does not contain any checkpoints. Basically, I suggest either making it more explicit that both --output and --load need to be provided and kept consistent, or, better, making it sufficient to provide only --load.
It would also be good to have a documented script for generating the performance plots, rather than the notebook in temp.
Best, Ken
Hi @don-tpanic,
Thanks for the suggestions!
1. I've added --image to the list of CLI arguments, where you can specify the path to an image on which visualization operations will be performed. I've added information about this option to the README as well.
2. When a path is provided to --load, the output directory will be overwritten with this argument. Thus, all visualization outputs will be stored in the same directory containing the trained model. I hope this is what was expected.
I have tested the code after modifications (1) and (2), and it's working fine on my end. If you face any issues running the code on your end, please let me know.
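For example, an illustrative invocation (the checkpoint path and image name are placeholders, not actual files in the repo) would be:
python3 main.py --config 'configs/learned_2d.yaml' --task 'viz' --load '<path-to-trained-model-dir>' --image 'imgs/<some-image>.png'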
Please also note the following changes:
Not providing --load for visualization tasks in any of the experiments will now throw a NotImplementedError. Earlier, it would generate outputs with a randomly initialized model. I found this more sensible, as it notifies the user of the requirement instead of generating wrong results.
Thanks and with warm regards, Nishant Prabhu
Hi everyone,
I've added a script, plot_performance.py, to plot the train/validation metrics of the desired experiments. Instructions for using the script are in the README. Please let me know if it fulfills the required functionality, or if you run into any issues.
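For context, a generic sketch of what such a plotting script does is shown below; the actual plot_performance.py in the repo may read a different log format and expose different options:

```python
import argparse
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sketch: read a CSV of per-epoch metrics and plot train/valid curves.
parser = argparse.ArgumentParser()
parser.add_argument("--log", required=True, help="CSV with columns: epoch, train_acc, valid_acc")
parser.add_argument("--output", default="performance.png")
args = parser.parse_args()

df = pd.read_csv(args.log)
plt.plot(df["epoch"], df["train_acc"], label="train")
plt.plot(df["epoch"], df["valid_acc"], label="valid")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig(args.output)
```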
Thanks and with warm regards, Nishant Prabhu
@NishantPrabhu thank you! I will wait for the reviewers to confirm this is all OK now! Very exciting, as I think we're very close.
Hi @NishantPrabhu,
Thanks for all the prompt updates. Here is another issue, which came up when I ran python3 main.py --config 'configs/vit_hier.yaml' --task 'train':
File "/home/ken/projects/Self-Attention-and-Convolutions/ViT/main.py", line 346, in <module>
trainer = Trainer(args)
File "/home/ken/projects/Self-Attention-and-Convolutions/ViT/main.py", line 82, in __init__
if os.path.exists(os.path.join(args['load'], 'best_model.ckpt')):
File "/home/ken/.conda/envs/rescience/lib/python3.9/posixpath.py", line 76, in join
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Can you look into it and make sure SAN/ doesn't have the same issue?
Thanks, Ken
Greetings!
Sorry about this bug; it has been fixed now. Tested okay for SAN/ and ViT/ on my end.
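For the record, the crash came from calling os.path.join on a missing --load argument; a minimal illustration of the kind of guard needed (not necessarily the exact change in ViT/main.py) is:

```python
import os

args = {"load": None}  # e.g. when --load is not passed on the command line

# Only look for a checkpoint when --load was actually provided, instead of
# passing None into os.path.join and crashing with a TypeError.
load_dir = args.get("load")
if load_dir is not None and os.path.exists(os.path.join(load_dir, "best_model.ckpt")):
    print("resuming from checkpoint in", load_dir)
else:
    print("no checkpoint specified; starting fresh")
```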
Thanks, Nishant
Hi all,
I accidentally closed the issue and have reopened it. Sorry if this causes any problems!
Thanks, Nishant.
@NishantPrabhu Jumping in to ask for your opinion on a proposal for improving the ReScience C reviewing workflow for future HPC-style submissions. Unrelated to this review, I am not asking you to adopt this scheme retroactively! From your point of view, would such a process be a reasonable burden on authors of future submissions?
@khinsen First of all, sorry for the delay.
Most of the suggestions and issues put forward by the reviewers were genuine and necessary for our reproducibility effort to be complete. There is one suggestion I would like to make, though.
More often than not, authors maintain detailed logs of the experiments they run (on platforms like TensorBoard or WandB). These can be made available to the reviewers to verify the results, removing the need for the reviewers to redo the experiments on their end. On a case-by-case basis, some experiments can be looked at in more detail if the logs are not convincing enough. This requires no extra effort from either the authors or the reviewers. Apart from this, the current reviewing workflow seems very reasonable.
Thanks, Nishant.
Also, if I may ask, is there anything else we can help with to complete the review? It would be great to have an estimate of when the process will be complete.
Thanks for your time. Nishant.
Given that the reviewers are donating their time in a pandemic, I do not want to pressure them, especially as I think they have done a fantastic job so far. When @don-tpanic and @nicksexton are happy to take this to acceptance, we can continue. What is left, anyway?
Hi everyone, thanks for your patience.
@don-tpanic and I have finished reviewing all the simulations reported in the paper and we are happy to recommend it for publication. We really like the paper and feel it reports a replication clearly as well as providing novel insights. In general, everything we did in replicating it checked out: the code was clearly documented, mostly ran straight out of the box, and the authors have been very responsive in addressing the exceptions. All the simulations we have run match the results in the paper, except those originally run on AWS, for which we did not have the hardware in the lab; as discussed above, all the other results match the paper, so we have high confidence that these would also replicate.
Nice work and congratulations!
We would like to thank the reviewers (@don-tpanic, @nicksexton) and the editor (@oliviaguest) for taking the time to review our submission. Your feedback has definitely helped improve our reproducibility report.
@oliviaguest, what would be the next steps from here, and is anything required from our end to complete the process?
Sure, @MukundVarmaT, you can make sure your metadata.yaml file is fully filled in with what is required of you. I need it for the publication script: https://github.com/rescience/articles#15-steps-to-publishing-a-rescience-article.
@oliviaguest Please find the updated metadata.tex, article.pdf, and metadata.yaml. I believe some of the information has to be filled in by you, which will also be reflected in article.pdf.
@MukundVarmaT can you check whether it's easy for you to fix the text and Figure 1 going into the margins on page 4, please?
@MukundVarmaT I might be wrong but maybe all the figures are going into the margins... is this on purpose and I just never noticed it before, @rougier?
Original article: On the Relationship between Self-Attention and Convolutional Layers by Jean-Baptiste Cordonnier, Andreas Loukas & Martin Jaggi, published at ICLR 2020.
PDF URL: Reproducibility report
Metadata URL: Metadata
Code URL: Code
Scientific domain: Deep Learning
Programming language: Python
Suggested editor: Nicolas P. Rougier (@rougier)
Dear all, This is a reproduction of the paper mentioned above, written with Mukund Varma (@mukundvarmat). The paper discusses the use of transformers and self-attention for vision tasks. We suggest Nicolas P. Rougier as the editor.
With warm regards, Nishant Prabhu