I will handle the editing of this submission.
@MehdiKhamassi @degoldschmidt would you be interested to review this submission?
@benoit-girard Thank you for the invitation! Unfortunately I am currently too busy. So I have to decline. Best wishes
@benoit-girard if you don't mind, I'd be happy to jump in and review this submission. The intersection of recurrent networks and reward-modulated/reinforcement learning matches my interests, and I'm fluent in Python (and Matlab) as well.
@MehdiKhamassi : OK, no problem. @schmidDan : Thanks for proposing, as there is no obvious conflict of interest, and as you seem to have the right expertise, you're welcome as reviewer 1!
@piero-le-fou @ChristophMetzner @stephanmg @junpenglao @thmosqueiro @rgutzen Is one of you interested in reviewing this submission?
Sorry, I don't have time to do a review at the moment.
@benoit-girard What is the expected deadline for a review to be finished? Right now I've little time, but would still be interested.
@ChristophMetzner Thanks for answering! @stephanmg We have no general policy (so far) at ReScience C. Do you think you could do it within 3 weeks? i.e. aiming at the 15th of April.
@benoit-girard thanks for the kind words. I'm right now a bit swamped with other tasks. I could start reviewing the submission starting from 1st of May. (I guess this might be too late)
@stephanmg Well, if I can find another reviewer who can handle this submission earlier, I will rely on her. Otherwise, I will solicit you.
No worries, I'd be happy to do it, just the time concern right now. So let's see how it goes.
@benoit-girard, I'd also be happy to review this submission, and likely could get to it within the next week(s).
Perfect! Thanks a lot @rgutzen you are reviewer 2, then. @stephanmg we won't need your help (this time).
@benoit-girard @rsankar9 Here is my review. I hope you find it helpful, and I'm happy to discuss if any points are unclear.
Overall, this is a very clearly structured paper and replication. Especially the investigation of the model limitations is very valuable. It's easy to read and to understand. The reimplementation is understandable and runs out of the box. I have some comments, mainly concerning how the agreement of the reproduction/replication is evaluated and presented.
@rsankar9 I enjoyed reading and reviewing your submission. Manuscript and code are well written! I had some things to point out in both, code and manuscript. I'd consider them as minor improvements to the quality of the contribution, as the validity of your main claims are assured nevertheless. Please find my review below @rsankar9 @benoit-girard .
@rsankar9: did you have the opportunity to update your submission based on the reviewers' comments? Moreover, as asked by @schmidDan, do you have the authorization of the authors of the original paper to publish their MATLAB code here?
@schmidDan sorry for being so late to answer your questions:
Thank you for the very thorough and constructive reviews. I do apologise for my delayed response. I have made several changes to the manuscript, based on your reviews.
More specific questions have been addressed inline for each review. The code and manuscript have been updated in the repository.
@benoit-girard : I do have the permission of the authors to publish their MATLAB code here. However, I do not have the permission to use the images from their paper (the trajectory traces). I shall contact them regarding this and keep you posted.
@rgutzen : Thank you for your review! Please find below my responses to your comments (inline).
@benoit-girard @rsankar9 Here is my review. I hope you find it helpful, and I'm happy to discuss if any points are unclear.
Overall, this is a very clearly structured paper and replication. Especially the investigation of the model limitations is very valuable. It's easy to read and to understand. The reimplementation is understandable and runs out of the box. I have some comments, mainly concerning how the agreement of the reproduction/replication is evaluated and presented.
Replication/Reproduction
* It is great that you perform both a reproduction with the original code and a replication with your own implementation and separate clearly between them. Just in the discussion, their evaluation could be described more precisely.
* The main issue I see in the submission is the lack of explicit criteria used to judge how successful the reproduction and replication are. You use vague qualitative assessments such as "successfully closely", "function as presented in the paper", and "model does reproduce the results"; however, it is unclear how you arrive at this evaluation and what it entails in detail. Which features of the trace deviation are relevant, or just the average deviation from the target? Are some deviations from the target trace worse than others? You should preferably aim for a quantitative evaluation of the similarity of traces that incorporates the relevant performance measures of the model, or, if you provide a qualitative evaluation 'by eye', it is helpful to explain which features of the model behavior you focus on. The figures also do not show a 'distance from target' measure as in the original paper, which would already be very helpful in that regard.
This is a very valid point and I understand the concern regarding the evaluation criteria. The replication was considered successful by visually inspecting the plot of the MSE of the trajectory, i.e. the deviation of the simulated trajectory from the target trajectory. To address this in a more quantitative manner, I have included two metrics in the manuscript now.
Hopefully, these two criteria should improve the quantitative assessment of the performance of the different implementations.
I've supplemented the vague terms "successfully closely" with these two metrics, and substituted them with satisfactory/unsatisfactory wherever applicable.
I've included the MSE plots for the simulations, along with the mean deviation over the test phase. This should address whether some anomalies in the test phase are better or worse than others.
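For concreteness, here is a minimal sketch of this kind of deviation metric, assuming the simulated and target trajectories are available as equally sampled NumPy arrays (names and shapes are illustrative, not taken from the repository):

```python
import numpy as np

def mean_deviation(trajectory, target):
    """Mean squared deviation of a simulated 2-D trajectory from the target,
    averaged over the test phase; both arrays have shape (T, 2)."""
    return np.mean(np.sum((trajectory - target) ** 2, axis=1))
```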
Shouldn't the traces in the paper be reproduced exactly by using the same Matlab code and seed?
This is a valid point, which we had also initially expected. However, this is not the case due to the differences between the random number generators in MATLAB and Python. While we were able to replicate the exact same behaviour for most MATLAB functions, we were unable to do so for one particular function, `sprandn` (also mentioned in the paper). This results in the initialisation of the J matrix being different and, hence, in the differences in the final trajectories. To address this query better, I've included the results of one simulation where I've plugged the J matrix produced in MATLAB into the Python adaptation, with the same random seed. You will see that in doing so, we obtain the exact same results.
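As a rough illustration of this cross-check, a J matrix saved from the MATLAB session can be loaded on the Python side roughly as follows (the file and variable names here are assumptions for illustration, not the repository's actual ones):

```python
import numpy as np
from scipy.io import loadmat

# Load the recurrent matrix J produced by the MATLAB script
# (assumes it was saved there with `save('J_from_matlab.mat', 'J')`;
# MATLAB sparse arrays come back as scipy.sparse matrices).
J = loadmat("J_from_matlab.mat")["J"]
if hasattr(J, "toarray"):
    J = J.toarray()

# J is then passed to the Python model in place of its own initialisation,
# keeping the seed and all other parameters identical.
```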
* Using such a more quantitative evaluation the comparison of paper, Matlab, and Python version (+ modified) could be more differentiated. For example, from the figures alone the Python version seems to have more variation (be less stable/ more chaotic) than the Matlab version, nearly throughout the tasks (especially 1&2). This is however not addressed or explained.
* Another point, that is mentioned but apparently not really factored into the evaluation of the model is its dependency on the initial seed. A model that deviates from its intended behavior in 50% of cases just because of the choice of seeds seems jarring, especially as it is not reported in the original paper. It lets the reader wonder to what degree the original results were accidental and cherry-picked. To better grasp the influence of the seed and robustness of the model I'm missing the figures of the runs with different seeds in the paper (e.g. as supplements) and in the best case a quantification of the variation due to the chosen seed.
Text
* It would be helpful for the reader to state the level of reproduction and replication already early in the intro. Something in the realm of 'successful with minor/moderate limitations and modifications'.
* Missing an explanation of 'e' in equation 2.
* "verify performance of the model": This formulation is somewhat confusing to me. Evaluating a model's output against another would rather be a validation instead of a verification, and performance may also refer to the usage of computational resources instead of accuracy. I would suggest a reformulation explaining more precisely what you are doing, e.g. evaluate the accuracy/robustness/behavior of the model / validate the model against its Python re-implementation / verify whether the original script produces the published results.
* 2.1, l.7: "Post the training phase". When is training stopped? Is the training time fixed? What are the criteria?
* Figure 4 is not referenced in the text.
* Where the agreement of the model outputs is described as "successfully closely" or similar, a more precise statement based on previously defined criteria would be suited (as described above).
* Similarly, the discussion needs to be more differentiated and transparent in how you come to your conclusions, e.g. "For Task 1 and 2, we can now confirm that the three algorithms function as presented in the paper. For Task 3, the SUPERTREX model's behaviour is also reproducible, ..."
Typos/Grammar/Suggestions
* l.5: fail -> fails
* l.6: utilise -> use
* In the first sentence of the task descriptions, e.g. "Here, we compare the simulations of the author scripts and our re‐implementation for Task 1 using FORCE, RMHL and SUPERTREX, with ...", the last comma is not needed.
* to initialise -> initializing in "To do so, we run the simulations with the default seed and repeat it ten times with different seeds to initialise the random number generator." Otherwise it may sound like you run the simulations to initialize the RNG.
* demarcates -> marks the separation of
* In 3. Modification: use numbered bullets instead of "* One, ..."
* In 4. Discussion, l.3: out -> our
Figures
* This might not be technically possible, but it would help with the visual inspection of the butterfly traces if all three instances (Matlab, original, Python) would have more similar linewidths and colors (and same butterfly height/width ratio).
* As already mentioned above, the figures would benefit from also including a distance-from-target trace, similar to the MSE in figure 5.
* Why don't you also show the time series from the paper for comparison?
* Fig. 2 caption: "shows the actual output". Aren't both rows the actual output, just in different representations? Maybe reformulate as 'time series'.
* Fig. 3, a) & c): missing time series. Is there a particular reason for that?
* Fig. 5: Scale of MSE is very hard to read.
* And again as a reader I would be very interested to also see the runs with other seeds or some aggregate evaluation of them.
Code
* Regarding the code and the repository, there is very little to object to from my side. It is well structured, documented, easy to reuse, and reproduces all figures.
* I wouldn't mind having a requirements.txt or environment.yaml file instead of the specification in the Readme.
Included a requirements file.
* The ten arbitrary seeds for the RNG are chosen separately for the different models, so the comparisons are both between models and seeds when they are not fixed (which can have a major influence, as you showed). This, however, doesn't become clear from the description in the paper.
* Just for curiosity, why did you choose a custom Python implementation of the model, what kept you from using a simulator engine like Brian, Nest, or even PyTorch?
@schmidDan: Thank you for your review! Please find below my responses to your comments (inline).
@rsankar9 I enjoyed reading and reviewing your submission. Manuscript and code are well written! I had some things to point out in both, code and manuscript. I'd consider them as minor improvements to the quality of the contribution, as the validity of your main claims are assured nevertheless. Please find my review below @rsankar9 @benoit-girard .
Review Summary
* Full replication | Partial replication | Failed replication | Reproduction: I believe it's a "Partial replication", but I have two things to note here @benoit-girard:
  * The authors mention in their manuscript's section "2 Comparison with Python Re-implementation" that they had access to the Matlab code of the original publication's authors. This essentially introduces a bias to the replication, which became evident in that the authors made modifications to their Python code based on how the Matlab code differs from the methods described in the original publication. So I believe that without the Matlab code at hand it would have been a "Failed replication" due to the seemingly missing information in the original publication. The authors consider their partial replication a "Re-implementation", which sort of reflects this circumstance, and they very well point out the differences they discovered between the original paper and the Matlab code. I would therefore consider it a successful partial replication, but wanted to point out this circumstance nevertheless.
  * Is there a way to denote the difference between full and partial replications in the article's tag (i.e. [Re] vs. [Re\] or something)? So far the manuscript is tagged as a "Replication" despite not replicating all experiments (i.e. "Partial replication").
I agree with this assessment. It's a partial replication for two reasons.
* Licensing: BSD-2-clause license for both the Matlab code (by Pyle and Roberts) and the Python code (by the contributors of this submission).
* Reproducibility of the replication: I was able to run the Python code without any problems. The Matlab code did not work under GNU Octave 6.2.0. As Matlab is closed-source, I'm not sure what the directive would be here either way @benoit-girard?
* Clarity of the code: Overall well structured and documented code.
* Clarity and completeness of the accompanying article: Concise and easy to read, captures the main claims and contributions of the original publication, and points out robustness issues encountered in certain experiments as well as a possible solution. Some points could be a little bit clearer (see my comments below).
Remarks w.r.t. the manuscript
* Layout: Reading through the references is a little bit complicated, since LaTeX obviously cluttered some graphics in between the list of references (Figs. 5, 6). The figures should be placed before the "References" section starts.
* References: Reference 1 is missing the journal name and is formatted differently than refs 2-6 (title in bold instead of journal, no quotation marks around title, etc.). Some journal names are in title case, others in sentence case.
Fixed.
* Framework: * "This allows for partially unsupervised learning" (p.2) - what is meant by "partially unspervised learning"? * The article should make for a self-contained read without reiterating every detail from the original publication. For my taste, this would include stating the complete model equations, i.e. reservior equations, computation of `z_1`, `z_2` from `r`, and mentioning the numerical solver used to simulate the system of differential equations, so that the reader knows about the meaning of the weights updated by eq.1-3. * While FORCE and RMHL acronyms are epxplained, this was omitted for SUPERTREX (p.2).
* Task: * "each tasks uses [...] multi-segmented arm" (p.2) - some tasks are based solely on the pen's position. Admittedly, one could view that as a 0-segment arm condition, though. * Concise language: "A non-linear inverse transformation would be required" (p.2) - transformation of what? having read the original work this is clear to me, but might not be clear for someone who doesn't know/forgot about the details in the original work. * Reading this section as well as "Task 1" and "Task 2" it might seem to the reader as if RMHL and SUPERTREX are learned with non-scalar error values for Task 1, as for Task 1 it is solely noted that "known target" output is used for training while for Task 2 it is explicitely stated that the error is a scalar. Indeed, for Task 1 the error (to the reward-modulated learner) is as well of scalar type. Critically, though the distinction comes from how the scalar is computed: output and target domain match in Task 1 - distance of the pen's position is indeed the target value of the output - the error couples linearly to the correct solution's output values -, while for Task 2 and 3 the output domain is angular values, but the target domain is the end effector's position - the error is non-linearly coupled to the correct solution's output values.
* Task 1,2,3:
  * The explanation should note that during testing the feedback is provided via teacher forcing (cf. the original publication), since it was shown to have quite an important impact on the model performance.
* Task 3: * "We compare the performance of the MATLAB scripts and our Python adaptation for the three algorithms on Task 2, with the results presented in the article." - This sentence isn't clear to me. I believe it should say "Task 3" instead of "Task 2" here? Also describing your Python (re-)implementation as an "adaptation" might be a little bit confusing in the light that you later on indeed modify (i.e. adapt) your Python implementation to increase robustness. * I don't believe the text accurately reflects the degree to which the results are in line with the original publication here. While the later sentences acknowledge a deviation from the target contour, the summarizing introduction ("The MATLAB scripts and the Python re‐implementation are able to successfully reproduce the results presented in the paper with the default seed as well as with the 10 arbitrary seeds for the RMHL algorithm.") raises the impression, that the results in the orginial publication look about the same. Indeed, they don't, which is also why you changed the seeds (which nevertheless do not make up for the complete discrepancy w.r.t. the original publication). See also my comments w.r.t. "Results" in "Remarks w.r.t. the implementation".
* Modification:
  * Figure 4 is never referenced, but I feel like it should be noted somewhere within this section?
  * "We observe that RMHL performance is comparable to the original Task 2, [...]" - this is nowhere shown, right? So maybe add a "(not shown)".
  * The "two minor alterations" you stated refer to Tasks 1-3 throughout the section. To me it is not entirely clear whether the results of the previous sections have been generated with or without these alterations. From the structure of the manuscript and the labelling of the figures I would expect that the alterations mentioned are only applied for experiments conducted within "3 Modification". But, on the other hand, why would one state then how these alterations relate to Tasks 1-3?
  * It is great to have the two alterations stated as a bullet point list. What might be a little confusing is the reiteration of the two points (with an added interpretation of them) in the subsequent paragraph, as by using "also" in "We also increase the error threshold [...]" it seems, on a first look, like here comes an additional alteration on top of the ones from the bullet point list, while it is indeed the second item of the list.
  * "[...] the model is able to perform well with up to 50 arm segments (Figure 5)." - Again, I believe "well" is a little bit optimistic for some of the results. For e.g. 10 arm segments (Fig. 5) I wouldn't use the word "well", while for e.g. 6 arm segments this seems to be justified.
* Figures: * No indication for the temporal extent of the plots is given (x-axis) (cf. original publication).
I have added this information in the caption now.
* I believe the "Original" butterfly plots are taken from the original publication. If so, this needs to be stated (and potential permission issues would have to be clarified with the holder of their copyrights).
It's true. I will contact the authors and keep you posted. Meanwhile, I have added a footnote indicating these plots have been re-used from the original paper.
* The "Original" butterfly plot looks squeezed w.r.t. the horizontal extent. Is this just a visual feat, or as well corresponding to differencees in the underlying implementation?
I use an aspect ratio of 1:1. Also, the scripts of the authors seem to use an aspect ratio of 1:1. So the plots from the paper perhaps look squeezed due to the way the journal prints them.
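For illustration, the 1:1 data aspect ratio can be enforced in Matplotlib along these lines (a minimal sketch with an arbitrary closed curve standing in for the butterfly trace, not code from the repository):

```python
import matplotlib.pyplot as plt
import numpy as np

theta = np.linspace(0, 2 * np.pi, 1000)
r = 2 + np.cos(4 * theta)                # placeholder closed curve
fig, ax = plt.subplots()
ax.plot(r * np.cos(theta), r * np.sin(theta))
ax.set_aspect("equal")                   # 1:1 data aspect ratio, so the shape is not squeezed
plt.show()
```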
* For viewing the document digitally, it might be beneficial to use vector graphic plots instead of pixelated information. Note the difference when zooming in into the figures. Just in case this is easily realizable and doesn't require too much effort.
All the figures are in "eps" format now.
* The different plots within a figure aren't drawn to the same scale (note the difference in the "MATLAB" and "Python" red butterfly outlines, or the vertical scaling of e.g. x and y axes), which makes it harder to visually judge relative performance. Same goes for the horizontal scaling of the temporal axes where weight changes are depicted (i.e. the gray vertical bars don't align with those of the angular evolutions).
I have now aligned the scales for corresponding plots.
* No plots for the evolution of the performance metric (distance from target) and weights (norm of the weight matrix) are shown. This would aid the comparison with the original publication and, I believe, would make the case for a successful replication even stronger.
I have added the MSE subfigures to all tasks, and the W_norm plot to all tasks except task 3, due to space/legibility constraints.
* Some plots (in Figs. 4,5) have scales for their vertical axis plotted, which is great. Unfortunately, the ticks of these scales are too small and squeezed to be interpretable.
Improved.
* For some of the temporal plots a slightly increased linewidth would be helpful.
Fixed for all plots.
* Up to taste: Using the same color tone (same blue) for "MATLAB" and "Python" temporal evolution plots would be pleasant.
Fixed for all plots.
* Figure 1: It seems like the "MATLAB" and "Original" butterfly plots consist of one period of drawing the shape (blue line), while "Python" seems to have multiple periods visualized. Is this the case, or are the different periods just aligning perfectly?
* Figure 2: * For the plots from the original article there seems to be some vertical gray artifact line to the left of each plot (it vanishes and appears w.r.t. my reader's zoom factor).
I was unable to reproduce this. I have re-uploaded new figures, in any case. Could you let me know if the issue persists?
* The caption seems to have a copy-paste error: "The second row shows [...] (x and y coordinates, in this case) [...]" - actually, the angles are depicted (according to the vertical axes' tags).
Fixed.
* Figure 3: * For the plots from the original article there seems to be some vertical gray artifact line to the left of each plot (it vanishes and appears w.r.t. my reader's zoom factor).
I was unable to reproduce this. I have re-uploaded new figures, in any case. Could you let me know if the issue persists?
* Suplot "(a)"/"(c)": "The target time-series is imitated well by the model during the training phase [...]" - this information is not shown and should be denoted as such (i.e. "(not shown)").
Fixed.
* Subplot "(b)": The y-axes of the plots of evolutions of the angles are somehow having an additional, squashed tag for each angle on to their left. These are probably due to how the plots have been generated.
Fixed.
* Subplot "(c)": I'm not sure whether "[...] with slight divergences [...]" does the subplots do justice. Qualitatively, the results are still of butterfly shape, but quantitatively, I believe, the performance metric will be quite impacted during some phases of each period (not shown).
I've included a deviation metric now, both in the caption and as an aggregate summary in a table. This should hopefully justify the statement.
* Figure 4: * The differently colored lines of the evolution of the matrices' norms aren't described (i.e. green = exploration, purple = mastery pathways / `W_1` and `W_2`).
Fixed.
* Discussion: * "Only two necessary details were missing, [...]" - if they have been provided by the original authors upon request, it would be worth to mention it. Or have they been found by trial-and-error?
* "[...] the inclusion of a crucial learning rate of 0.0005 [...]" - can you explain which learning rate this refers to? If I read the original publication's method section correctly, then they state a learning rate of k=0.5 for Tasks 1 and 2 and k=0.9 for Task 3 (p. 1457).
* "[...] and a compensatory factor of 0.5 within the update of the readout weights of the exploratory pathway [...]" - I'm not sure what this exactly means wihtout an equation. I believe it's just another mulitplicative factor to the update. If so, why wouldn't one state it as part of the learning rate then? I.e. "[...] the inclusion of a crucial learning rate of 0.0005 for Task 1 and 0.00025 for Tasks 2 and 3". [edit: ...reading your code as well I understood that you treat it as a separate factor, since it is encapsulated in a "compensation" method.]
* "For Task 3, the SUPERTREX model's behaviour is also reproducible, [...]" - question to clarify: reproducible or replicable?
* Smaller comments:
  * fully supervised vs. fully-supervised: inconsistent usage of a hyphen throughout the text.
  * Inconsistent usage of language: sometimes you're referring to the Python code as "re-implementation", sometimes as "adaptation". Or are these two referring to different parts of your code? If yes, which ones?
  * Discussion: "provided by the author" should probably be plural ("authors") or "by the corresponding author"?
  * Task 1: "The article claims that :" - a whitespace between "that" and ":".
  * Task 3: "The article claim that" - should be "claims"; ":" missing at the end.
  * Discussion: "[...] with out modular and user-friendly Python replication." - should be "our".
  * Discussion: "[...] for task 3" - throughout the text "Task ..." was used as a proper noun, so it should be "Task 3" here as well.
Remarks w.r.t. the implementation
* I understood that the Matlab implementation was provided by the authors of the original publication. But did they as well agree to have it hosted on GitHub as well as being considered part of your submission to ReScience C?
* `README.md`:
  * The file mentions "The Python re-implementation has been submitted to ReScience C", but if I understand correctly I need to review the Matlab part as well to check whether the results in your manuscript hold. So, if the Matlab part has to be part of my review, isn't it then as well part of the ReScience C submission? And if so, as it is provided by the original authors, how does this relate to tagging this submission as Replication vs Reproduction (@benoit-girard)?
  * Requirements are stated in e.g. `Code/Python implementation/Reimplementation/README.md`. I would welcome having them as a proper `requirements.txt` (with versions enforced by "==") in the respective folders, as it would be the common way to tell a Python user which packages to install.
* Comparison of code with original publication:
  * Butterfly generator functions: Comparing with the original publication (Sec. 4.5 Tasks) it seems like there is a discrepancy between what was described and what is implemented in Matlab (respectively re-implemented in Python). They say `x(t) = r(t)*cos(t)` with `t` in 0 to 10^4 ms, while every generator method I checked in the implementation does `x(theta) = r(theta)*cos(theta)` with `theta` in 0 to 2pi. Maybe that would be worth a comment in the code or similar.
  * For the psi function (Sec. 4.5 Tasks) it is written that `Psi(x) = 0.025 * (10.0 * x)^0.25` is used for Task 3, while in the code (Matlab and Python) it is set to `Psi(x) = 0.005 * (10.0 * x)^0.25`. Maybe it would be good to point that out.
  * The distance from target was said to be computed from "sqrt(\bar{MSE})" (p. 1455 of the original publication). Comparing with the Python code (e.g. ModelFORCE.py:272,379), the "sqrt" operation seems to be commented out(?).
  * Smoothing time constant for `z` (\bar{z}) seems to be 10 according to the paper (p. 1455), while in the Python parameter files (e.g. "simulation_parameter_file_Task1_FORCE.json") it is `tau_z: 1`.
  * Comments: At some parts in the code there are comments like "can be removed" (e.g. ModelRMHL.py:165,174) or "Possibly wrong to use for" (ModelSUPERTREX.py:230), or unused variables like "unnec" in e.g. ModelSUPERTREX.py:117, or "smooth noise?" in Task2_ST_Seg2.py:134, which overwrites the result from Task2_ST_Seg2.py:133. Or "weird stuff in code" in Task.py:132. Are all/some of them meant to make it into the published version of the code, or should they be removed beforehand?
Regarding the butterfly generator functions: the original publication states `x(t) = r(t)*sin(qt)` with `t` in 0 to 10^4 ms, though. Would you agree?

* Results: While the results are as described in the accompanying article, taking a look at all the results in the different "Results" folders, it seems like the Matlab implementation yields higher performance (lower MSE) compared to the Python implementation (e.g. comparing the "MSE.png" files for "FORCE_Task1" it seems like Matlab reaches as low as 10^-6, while Python reaches just 10^-4). Do you have any idea why?

This is primarily due to the teacher forcing used in Task 1. I was not using the same feedback in the Python scripts, which led to this discrepancy. I have fixed that now, and added the deviation metric, which should provide further insight into this.

* The way of initialising "J" (e.g. ModelFORCE.py:106-113) seems a little bit complicated. I'm just wondering whether [`scipy.sparse.random`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.random.html#scipy.sparse.random) wouldn't be an alternative (using the respective `data_rvs` argument)?

It's true that it's convoluted, and technically it is definitely not necessary. However, there's also no shortcoming besides readability. The only reason I implemented it this way is that I was attempting a close adaptation, and this is the closest method I could find to the MATLAB implementation. It handles redundancy the same way as MATLAB, which is not the case with the SciPy function. Also, this way I'm able to monitor the calls to the random generator and align them better with the MATLAB code. So, if I replace just the initialisation of the J matrix with the one from the MATLAB simulation, the rest of the simulation progresses in a way that is identical to MATLAB. But, of course, it's definitely not necessary for the replication, and the SciPy function can be used too.
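For reference, a rough sketch of the alternative mentioned above, using `scipy.sparse.random` with `data_rvs`; the network size, density, gain, and variance-scaling convention here are assumptions for illustration, not the repository's parameters:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)      # illustrative seed
N, density, g = 1000, 0.1, 1.5      # illustrative size, sparsity, and gain

# Sparse recurrent matrix with standard-normal non-zero entries, scaled to
# variance g^2 / (density * N), a common convention for rate reservoirs.
J = sparse.random(N, N, density=density, random_state=rng,
                  data_rvs=rng.standard_normal)
J = (g / np.sqrt(density * N)) * J.toarray()
```

As noted above, this would not reproduce MATLAB's `sprandn` call sequence bit-for-bit, which is why the repository keeps the more verbose initialisation.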
Thanks a lot @rsankar9 for this revised version of the submitted paper. @schmidDan and @rgutzen are you satisfied with the new version of the paper and the answers of the authors, or do you require an additional round of review?
The revised article looks very good to me, and I'm satisfied with how all the review comments were addressed. Especially, the addition of the error traces is very helpful as well as the quantification of the results in the overview tables. For me, there is no additional review round required. I just noted down a few minor things for the revised version, whose fixing I'd consider however optional.
Fig. 1 caption: I don't quite understand "X‐axis scale:‐ 1s : 1 period"
In several figure captions you write "The horizontal grey line, in the test phase, indicates the deviation metric." which may be a bit ambiguous whether this metric is imposed or derived. To make this clearer you could write '... indicates the average MSE which is used as the deviation metric'.
page 8, bullets: missing whitespace 'n=11)and'; additional whitespace 'FORCE algorithm. The'
Fig. 5: It took me a bit to understand that there is nothing missing on the left Matlab side. Maybe, you'd like to add a note or indication to the plot that the simulation broke after x seconds, to make this clearer to the reader.
Thanks for the revision. I was occupied the last weeks. I will look at the updates by the end of this week, so that you can expect an answer by the beginning next week at the latest.
Thanks for your responses. @rgutzen I'll try to incorporate these comments, and upload the revised version by next week.
@rgutzen Thank you for your comments. I've made some minor changes in response to them.
@schmidDan did you have a look at the final version?
@benoit-girard @rsankar9 Thanks for your patience and apologies for the considerable delay (which was unforeseen on my end as well). I'm back in the loop now and have pulled the latest changes. I should be done with having a look at the final version by tomorrow evening.
@rsankar9 Thanks for your responses and incorporation of my previous comments, as well as your patience. The article as well as implementation make for a great read now.
Thanks to both the reviewers for your thorough and helpful comments. I've made some modifications to the manuscript. Hopefully these should address your concerns.
Review Summary
* The article as well as the implementation have been greatly improved since my last review. As well, all my points have been addressed satisfactorily. Additional (sub-)figures and clear metrics for judging the quality of simulation outcomes make for a great, understandable read. Given the specificities of the work and the code base provided by the original paper's authors, I appreciate that the article's authors tried to keep as many parallels as possible in their implementation w.r.t. the Matlab implementation - this will help potential users with cross-checking between the implementations and fosters understanding. Furthermore, the code is nicely formatted and documented with insightful `README.md` files. Overall, I believe, this qualifies the article and code at their current state for publication. From my side, no additional round of reviewing would be required. @benoit-girard
* While reviewing the article and code I nevertheless spotted some minor points, which I still want to point the authors to. In my view, these can be tackled without the need for another review. Please view these changes as optional either way.
Remarks w.r.t. the manuscript
* In Table 1 you mention `Note that for variant of task 2, the SUPERTREX statistics have been computed using only 2 simulations.`. I believe this is true for the `MATLAB` and `Python adaptation`, but probably you used all available 11 samples for the `Python re‐implementation` (and thus want to mention that)?
You're right. For the re-implementation, it uses all available samples. I've updated the caption to make the distinction.
* Some axis labels seem a little bit warped (stretched horizontally or compressed vertically)
I see what you mean. This is caused by the difference in aspect ratio between the original figure (produced by the scripts) and the figure displayed in the paper (scaled to fit by LaTeX). This can be solved by changing the aspect ratio / figure size when plotting the original figure. If it's ok, I would choose not to fix this, as it would require regenerating all the figures. Hopefully, the figures provided along with the manuscript, in their original aspect ratio, should be sufficient for readers who would like more details.
* It is nice to have the scale of the horizontal axis included in Fig. 1, bottom right. Probably the other figures would profit from a similar indication.
I've included it in the other figures now.
* Fig.4c subcaption could profit from mentioning which implementation has been used to produce the result.
The implementation used is the same as mentioned on top of the columns. I've added this information to the caption of Fig 4c to be clearer. I believe your doubt arises due to the difference in seeds. The reason for this is that since the default seed did not yield the desirable result (as shown in Fig 4b), I've shown a sample from one of the 10 extra runs. These runs use randomly chosen seeds for the generator and, hence, do not match.
* Fig.7 caption last sentence: "Using the MATLAB scripts, the readout weights increase uncontrollably rendering the model unable to learn." I'm not completely sure here, but probably it is a left-over from another figure? Seemingly, it doesn't fit the context of Fig.7.
Good catch. Yes, it is a carry-over mistake. Thanks for pointing it out.
* Minor typos etc. I spotted during reading (optional to be adapted):
  * p.1: singular/plural: "Most existing algorithms are built on fully supervised learning rules (e.g. FORCE [2]), which limits its potential" - should probably be "their potential"
  * p.5: "The mean deviation , for" - there's a whitespace in front of the comma
  * p.5: "[...] for both the original scripts (0.168±0.038; n=11)and" - there's a whitespace missing after the ")"
  * p.5: "Figure 1,2;" - there's a whitespace missing after the comma
  * p.12: "On the other hand, simulations of [...] seeds was able to" - should be plural: "[...] were able"
Fixed the minor typos.
* There are some overflow problems at the end of lines (optional to be adapted):
  * p.10: tables overflow onto the page's right margin and also some columns overflow (e.g. the "Task" column in Table 1)
  * p.12: "11; Modified Python re‐implementation: 0.140 ± 0.071, n=11)." overflows onto the page's right margin
  * p.16: "Python re-implementation" overflows onto the page's right margin
Fixed the overflow issues.
* Answers to previous discussion points:
I believe the "Original" butterfly plots are taken from the original publication. If so, this needs to be stated (and potential permission issues would have to be clarified with the holder of their copyrights).
- It's true. I will contact the authors and keep you posted. Meanwhile, I have added a footnote indicating these plots have been re-used from the original paper.
Thanks for adding the footnote. Just to make sure: Has there been an answer from the authors by now? (no action in the document required I guess)
We do have a response from the authors. They agree for both their code and their figures to be used with the appropriate license (included in the repository for the MATLAB scripts) and citation (included in the manuscript). I've filled in a request form (as per https://direct.mit.edu/journals/pages/rights-permissions) and am awaiting their response.
For the plots from the original article there seems to be some vertical gray artifact line to the left of each plot (it vanishes and appears w.r.t. my reader's zoom factor).
- I was unable to reproduce this. I have re-uploaded new figures, in any case. Could you let me know if the issue persists?
Seemingly the issue still persists for me, but I cannot tell why. Here is a screenshot from Fig. 3, b) at zoom factor 100% using Adobe Acrobat Reader on Windows 10 (although the issue is there for some of the other figures as well): link to the screenshot. Still, if this is just an issue on my end, then there's no action required here.
I remain unable to reproduce this issue. However, I use a different OS and pdf viewer. If it doesn't disrupt the understandability of the document, I reckon we let it be. I'm open to solutions if someone else is able to reproduce this error.
Remarks w.r.t. the implementation
In order to be able to run your code I needed to (would be nice to have those adapted):
* comment out the usage of `J_mat` in Python adaptation/ModelFORCE.py:101,116, as the file `var.mat` doesn't exist
* the requirement `json==2.0.9` throws an error (at least on Ubuntu, https://stackoverflow.com/questions/41466431/pip-install-json-fails-on-ubuntu) and `pip install -r requirements.txt` breaks, but everything is still functional if it is commented out in the respective `requirements.txt` files
* the requirement `pandas` is missing in the respective `requirements.txt` files
The corresponding changes have been made.
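For illustration, the resulting `requirements.txt` might look roughly like the following; the package list and pinned versions are assumptions, not the repository's actual file (the standard-library `json` entry is dropped and `pandas` is added):

```text
numpy==1.21.0
scipy==1.7.0
matplotlib==3.4.2
pandas==1.3.0
```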
I believe the "Original" butterfly plots are taken from the original publication. If so, this needs to be stated (and potential permission issues would have to be clarified with the holder of their copyrights).
- It's true. I will contact the authors and keep you posted. Meanwhile, I have added a footnote indicating these plots have been re-used from the original paper.
Thanks for adding the footnote. Just to make sure: Has there been an answer from the authors by now? (no action in the document required I guess)
We do have a response from the authors. They agree for both their code and their figures to be used with the appropriate license (included in the repository for the MATLAB scripts) and citation (included in the manuscript). I've filled in a request form (as per https://direct.mit.edu/journals/pages/rights-permissions) and am awaiting their response.
Just a quick update regarding this. The journal has granted nonexclusive permission to reprint the figures and equations from “A Reservoir Computing Model of Reward Modulated Motor Learning and Automaticity” in this article, provided that the original Neural Computation article is cited appropriately.
@rsankar9 thanks for the very last updates on the paper. Based on the positive feedback provided by the reviewers, @rgutzen @schmidDan, the paper is accepted for publication in ReScience C. I will now start the tedious editing process, and I will soon ask you to recompile your paper with updated data.
@rgutzen @schmidDan : thanks for the time and expertise invested in this review process!
@rsankar9 : I updated the metadata.yaml file, I made a Pull Request on your repository for that. I need you to recompile your pdf with the following data: updated editor & reviewer information, and dates of acceptance and publication. After that I will finalize publication.
@benoit-girard Thank you. I have merged your PR. The PDF has the updated information. Feel free to let me know if there's anything else to be done.
Nothing more to be done on your side, I just have to solve my identification problem with github, and then it should be over...
Ok, I see that the process has been finished (by @rougier ?), and that the paper is officially published and referenced on the journal website. Congratulations!
Original article: A Reservoir Computing Model of Reward-Modulated Motor Learning and Automaticity
PDF URL: https://github.com/rsankar9/Reimplementation-SUPERTREX/blob/main/ReScience_submission/article.pdf Metadata URL: https://github.com/rsankar9/Reimplementation-SUPERTREX/blob/main/ReScience_submission/metadata.yaml Code URL: https://github.com/rsankar9/Reimplementation-SUPERTREX/tree/main/Code
Scientific domain: Computational Neuroscience Programming language: Python, MATLAB Suggested editor: