Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Software report:
github.com/AlDanial/cloc v 1.88  T=0.04 s (355.8 files/s, 85784.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           8            278            598           1591
Markdown                         4            167              0            368
TeX                              1             20              0            236
Jupyter Notebook                 1              0            315             21
YAML                             1              1              4             18
-------------------------------------------------------------------------------
SUM:                            15            466            917           2234
-------------------------------------------------------------------------------
gitinspector failed to run statistical information for the repository
Wordcount for paper.md is 1296
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1093/bioinformatics/btz363 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.5281/zenodo.3509134 is OK
- 10.21105/joss.03021 is OK
- 10.1109/MCSE.2007.55 is OK
MISSING DOIs
- 10.1038/s41467-020-19015-1 may be a valid DOI for title: Benchmarking of cell type deconvolution pipelines for transcriptomics data
- 10.1101/354944 may be a valid DOI for title: Bulk tissue cell type deconvolution with multi-subject single-cell expression reference
- 10.1101/2020.10.01.322867 may be a valid DOI for title: Likelihood-based deconvolution of bulk gene expression data using single-cell references
- 10.1371/journal.pcbi.1006976 may be a valid DOI for title: Fast and robust deconvolution of tumor infiltrating lymphocyte from expression profiles using least trimmed squares
- 10.1038/nmeth.3337 may be a valid DOI for title: Robust enumeration of cell subsets from tissue expression profiles
- 10.1038/s41587-019-0114-2 may be a valid DOI for title: Determining cell type abundance and expression from bulk tissues with digital cytometry
- 10.1101/2020.02.21.940650 may be a valid DOI for title: AutoGeneS: Automatic gene selection using multi-objective optimization for RNA-seq deconvolution
- 10.1101/2022.11.11.516138 may be a valid DOI for title: Terminal differentiation of villus-tip enterocytes is governed by distinct members of Tgfβsuperfamily
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
Howdy @ritika-giri and @ManavalanG
Thanks for agreeing to review this submission.
The process for conducting a review is outlined above. Please run the command shown above to have @editorialbot generate your checklist, which will give a step-by-step process for conducting your review. Please check the boxes during your review to keep track, as well as make comments in this thread or open issues in the repository itself to point out issues you encounter. Keep in mind that our aim is to improve the submission to the point where it is of high enough quality to be accepted, rather than to provide a yes/no decision, and so having a conversation with the authors is encouraged rather than providing a single review post at the end of the process.
Here are the review guidelines: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html And here is a checklist, similar to above: https://joss.readthedocs.io/en/latest/review_checklist.html
Please let me know if you encounter any issues or need any help during the review process, and thanks for contributing your time to JOSS and the open-source community!
@LiBuchauer would you mind looking at those missing DOIs?
Hi, I added DOIs wherever possible. Thanks for agreeing to review @ritika-giri and @ManavalanG!
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@ritika-giri and @ManavalanG, can you please provide updates as to how your reviews are going?
@jmschrei I will get it completed this week :)
@jmschrei thanks for checking in - I will be done by Aug 20!
Apologies for the delay! I am working on the review and I will submit my review in the next few days.
@LiBuchauer In the interest of time, I have included my comments based on the tool installation and testing so far. I will follow up in the next few days with my further comments on the manuscript.
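Installation
- Installation requires several dependencies such as `numpy`, etc., but their versions are not pinned. I would highly recommend pinning them (e.g. `numpy>=1.24`) to eliminate version-related errors.
- It would be a good idea to specify the dependencies in a `requirements.txt` file for pip-based installation and an `environment.yml` file for conda-based installation. These files would also make it easier to specify the dependency versions.
Documentation
- Minor suggestion. `pip install .` needs to be executed irrespective of whether dependencies were installed via conda or pip. However, it was unclear to me that this step was needed when I chose the conda route. A minor reorganization would help.
- A snippet in `README.md`: "The repository contains example data from a publication on liver cancer microenvironments at examples/example_data/". Please cite the publication here. It is cited in the file `examples/cellanneal_quickstart.ipynb`, but mentioning it in the readme doc would greatly help the users.
- Fulfill this requirement from JOSS's checklist on documentation: "Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support". This is already accomplished by the included file `CONTRIBUTING.md`, but most users would likely not know about this file. So, linking it in the readme doc would make it easier for folks to come across the contributing and support guidelines.
Functionality
- Regarding performance, the manuscript states that "Its typical processing time for one mixture sample is below one minute on a desktop machine". However, in my testing of the tool on a MacBook Pro 2019 (2.4GHz 8-core processor, 32GB mem), it took ~2 mins. Command I used: `time cellanneal examples/example_data/mixture_data_liver_tumor.csv examples/example_data/signature_data_human_liver.csv output_directory`. Time output on my Mac: "674.31s user 179.48s system 738% cpu 1:55.67 total". I ran the command a total of 3 times, and it took ~2 mins every time. Please complete the following:
  - Mention the specs of the desktop machine where testing was performed
  - Confirm the time taken to run the tool.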
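To illustrate what a minimum-version pin such as `numpy>=1.24` asks of an environment, here is a minimal sketch that compares installed package versions against pinned minimums using only the standard library. The helper names (`version_tuple`, `satisfies_minimum`, `check_pins`) are hypothetical and not part of cellanneal, and the comparison only handles plain numeric versions; a real project would rely on pip or conda to enforce the pins.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '1.24.3' into (1, 24, 3); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies_minimum(installed: str, minimum: str) -> bool:
    """True if the installed version is at least the pinned minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_pins(pins: dict) -> dict:
    """Map each package name to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in pins.items():
        try:
            report[name] = satisfies_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Hypothetical pin, mirroring the `numpy>=1.24` example from the review.
    print(check_pins({"numpy": "1.24"}))
```

Comparing integer tuples rather than strings matters here: as strings, "1.9" would sort above "1.24", which is the kind of version-related surprise that pinning is meant to prevent.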
Thanks @ManavalanG, excellent points. I have started working on it as documented below.
Installation
- [x] Installation requires several dependencies such as `numpy`, etc., but their versions are not pinned. I would highly recommend pinning them (e.g. `numpy>=1.24`) to eliminate version-related errors. --> addressed in 3b8081e
- [x] Minor suggestion. It would be a good idea to specify the dependencies in a `requirements.txt` file for pip-based installation and an `environment.yml` file for conda-based installation. These files would also make it easier to specify the dependency versions. --> addressed in 3b8081e
Documentation
- [x] Minor suggestion. `pip install .` needs to be executed irrespective of whether dependencies were installed via conda or pip. However, it was unclear to me that this step was needed when I chose the conda route. A minor reorganization would help. --> addressed in 285b24c
- [x] A snippet in `README.md`: "The repository contains example data from a publication on liver cancer microenvironments at examples/example_data/". Please cite the publication here. It is cited in the file `examples/cellanneal_quickstart.ipynb`, but mentioning it in the readme doc would greatly help the users. --> added in 857bfa5
- [x] Fulfill this requirement from JOSS's checklist on documentation: "Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support". This is already accomplished by the included file `CONTRIBUTING.md`, but most users would likely not know about this file. So, linking it in the readme doc would make it easier for folks to come across the contributing and support guidelines. --> addressed in a944bc8
Functionality
- [x] Regarding performance, the manuscript states that "Its typical processing time for one mixture sample is below one minute on a desktop machine". However, in my testing of the tool on a MacBook Pro 2019 (2.4GHz 8-core processor, 32GB mem), it took ~2 mins. Command I used: `time cellanneal examples/example_data/mixture_data_liver_tumor.csv examples/example_data/signature_data_human_liver.csv output_directory`. Time output on my Mac: "674.31s user 179.48s system 738% cpu 1:55.67 total". I ran the command a total of 3 times, and it took ~2 mins every time. Please complete the following:
  - [x] Mention the specs of the desktop machine where testing was performed --> addressed in 9275cca
  - [x] Confirm the time taken to run the tool. --> I take the liberty of checking this off, because the input file contains 5 columns, i.e. 5 mixture samples, so the statement of less than 1 minute per mixture sample holds on your machine
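The per-sample figure behind that last answer can be checked with a few lines of arithmetic. This is only a sketch: it takes the wall-clock value quoted in the thread and the author's statement that the example file holds 5 mixture samples, and the helper name `total_seconds` is made up for illustration.

```python
def total_seconds(elapsed: str) -> float:
    """Convert an elapsed-time string such as '1:55.67' (m:ss.ss) to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


elapsed = "1:55.67"  # wall-clock "total" reported by the reviewer
n_samples = 5        # mixture samples in the example file, per the author
per_sample = total_seconds(elapsed) / n_samples
print(f"{per_sample:.1f} s per mixture sample")
```

Under these assumptions the run works out to roughly 23 seconds per mixture sample, which is consistent with the manuscript's "below one minute" statement.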
Here are my further comments, following my earlier feedback/comments.
Manuscript
- Title is really broad considering the content of the manuscript. It discusses application in the field of transcriptomics, whereas the title says `software for omics data`. Justification of how it applies to other types of omics data needs to be included, or the title needs to be modified to reflect the manuscript's content (i.e. transcriptomics data).
- The end of the second paragraph in the `summary` section mentions two major optimization algorithms (least squares regression and support vector regression) used to solve the problem, and then the third paragraph discusses the challenges of using least squares regression. However, such a description for support vector regression is missing, and adding a note on it would guide the reader. Also, mentioning how the algorithm used in `cellanneal` (Spearman's rank correlation coefficient) fits in between the above two algorithms would be helpful.
- Minor. Mentioning the dataset used to obtain the figures in the manuscript could be helpful.
Tool installation
- Documentation is missing info on how to install a particular version of `cellanneal`. For example, how to install `v1.0.0`? [Update Oct 2] This is minor and recommended but does not need to be included.
Note: In my initial feedback provided on Aug 21, I had tagged my comment about tool dependency version and conda environment definition as `minor suggestion`, but upon further reflection, I removed that tag. Having tool versions defined would help with reliable installation and reproducibility. Please let me know if you have any questions :)
@ritika-giri how is your review coming?
@LiBuchauer have you had a chance to look at the comments raised by @ManavalanG?
@ritika-giri can you please provide an update? @ManavalanG how is your review coming?
@jmschrei My initial review is complete. I will resume once I hear back from the authors.
Thanks for the update. @LiBuchauer have you had a chance to look at the comments?
@LiBuchauer are you able to work on the above :point_up:? To avoid general delays and losing track of the reviewers, we recommend that you respond to reviewer comments/issues in a timely manner.
Hi all, very sorry for the delay, I am working on the last three points now. Will update.
Hi @ManavalanG , I addressed the remaining 3 points about the manuscript as detailed below, thank you for your input. Please let me know if anything else is lacking.
Manuscript
- [X] Title is really broad considering the content of the manuscript. It discusses application in the field of transcriptomics, whereas the title says `software for omics data`. Justification of how it applies to other types of omics data needs to be included, or the title needs to be modified to reflect the manuscript's content (i.e. transcriptomics data). --> I changed it as requested.
- [X] The end of the second paragraph in the `summary` section mentions two major optimization algorithms (least squares regression and support vector regression) used to solve the problem, and then the third paragraph discusses the challenges of using least squares regression. However, such a description for support vector regression is missing, and adding a note on it would guide the reader. Also, mentioning how the algorithm used in `cellanneal` (Spearman's rank correlation coefficient) fits in between the above two algorithms would be helpful. --> I added one sentence about SVR method problems, and also outlined what cellanneal does differently.
- [X] Minor. Mentioning the dataset used to obtain the figures in the manuscript could be helpful. --> Done
Tool installation
- [ ] Documentation is missing info on how to install a particular version of `cellanneal`. For example, how to install `v1.0.0`? [Update Oct 2] This is minor and recommended but does not need to be included. --> I choose not to do this in the interest of time and also because there are currently not several versions :) hope it's okay
Note: In my initial feedback provided on Aug 21, I had tagged my comment about tool dependency version and conda environment definition as `minor suggestion`, but upon further reflection, I removed that tag. Having tool versions defined would help with reliable installation and reproducibility. Please let me know if you have any questions :) --> I had already addressed this
@jmschrei @Kevin-Mattheus-Moerman sorry again for the delay, I believe I have addressed all the points @ManavalanG has raised. Will wait for his feedback. Thanks everyone for your time!!
Thanks @LiBuchauer. @ManavalanG let me know what you think.
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@LiBuchauer Thanks for making the requested changes.
@jmschrei I have now completed the review. Please let me know if there are any questions :)
@ritika-giri do you have any other concerns about the paper? If not, please check off the remaining items in the list.
@ritika-giri checking in on this again
Sorry for the delay @jmschrei - I have 2 major comments:
Thanks @ritika-giri. @LiBuchauer can you respond to these concerns when you get a chance?
Hi good morning @ritika-giri,
regarding the first point, to my best knowledge there are no other packages that use simulated annealing as an optimisation procedure for optimising Spearman's R between experimental and computational method and I consider this an original idea we had. Obviously I will cite and discuss any such paper that you can name. A lot of the published methods rely on parametric distance metrics because of availability of fast optimization algorithms for them. In many omics contexts however, non-parametric methods have proven to be more stable, and thus, in cellanneal, we implemented a solution with Spearman’s rank correlation coefficient as a distance function and simulated annealing as an optimization procedure.
This also brings us to the second point. dual_annealing.py, as you note, is from scipy, which is clearly cited and marked. The reason that I copied the code over is that an important part of cellanneal is the GUI which comes as a single executable, and in order to keep this slim and reduce its start-up time, I wanted to remove scipy from the dependencies. Around this simulated annealing implementation, we provide a pipeline for importing bulk and signature data, extracting highly variable genes (yes, via the scanpy implementation, as is clearly marked), running the optimization and returning relevant plots as well as tabular results. The functionality is accessible via python, cli or GUI. We find our method to perform well and fast.
Overall, in my view, the most important points are 1) we present a new idea for performing cell type deconvolution based on a non-parametric distance function and 2) (maybe more relevant for JOSS), we provide this as an easy-to-use software which can also be employed by non-coding scientists and runs locally. So far, cellanneal has been cited by 3 peer reviewed publications (https://scholar.google.com/scholar?cites=10240263254458724390&as_sdt=2005&sciodt=0,5&hl=en), including two wholly unrelated to the cellanneal authors, and I know from people seeking support via email that it has also found its way into some biotech companies. It’s probably not worth anything, but the work that went into this is certainly (much) more than 3 months full time, though of course not all was spent on the code, but on conceiving and testing the method together with experimental biologists.
@ritika-giri hope this clarifies it a bit & @jmschrei hope you can make a call based on this explanation.
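For readers following the thread, the idea described above (Spearman's rank correlation as the distance function, simulated annealing as the optimizer) can be sketched in a few lines. This is a toy illustration under stated assumptions, not cellanneal's actual implementation: the data are synthetic, the names `signature`, `mixture`, `to_fractions` and `negative_spearman` are invented for the example, and it calls `scipy.optimize.dual_annealing` directly rather than the copy bundled with the package.

```python
import numpy as np
from scipy.optimize import dual_annealing
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 genes, 3 cell types (not the cellanneal example data).
signature = rng.lognormal(mean=2.0, sigma=1.0, size=(200, 3))
true_fractions = np.array([0.6, 0.3, 0.1])
mixture = signature @ true_fractions


def to_fractions(weights):
    """Normalise non-negative weights so that they sum to one."""
    weights = np.asarray(weights, dtype=float)
    return weights / weights.sum()


def negative_spearman(weights):
    """Objective to minimise: minus the rank correlation with the mixture."""
    predicted = signature @ to_fractions(weights)
    return -spearmanr(predicted, mixture)[0]


# A lower bound slightly above zero keeps the normalisation well defined.
bounds = [(1e-6, 1.0)] * signature.shape[1]
result = dual_annealing(negative_spearman, bounds, maxiter=200)

estimated = to_fractions(result.x)
print("estimated fractions:", np.round(estimated, 3))
print("Spearman correlation:", round(-result.fun, 4))
```

Because the rank correlation is piecewise constant in the weights, gradient-based optimizers have little to work with, which is one plausible reason to pair this objective with a stochastic global search such as annealing.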
In general, I think it's better to import functions from other packages (even big ones like scipy) so that (1) they get the credit they deserve for writing the original algorithm and (2) any upstream improvements in performance can make their way into your package without any effort on your part. That being said, it's not a strict requirement and the authors do seem to clearly state where the code is from.
I personally believe that the substantial scholarly effort criteria have been met. Remember that substantial effort at JOSS focuses more on the development of the code rather than algorithmic novelty -- though it does need to fill a niche. @ritika-giri if you know of any specific publications using simulated annealing you think they should mention, I agree that they should include them, even if they aren't explicitly optimizing Spearman R.
@ritika-giri what do you think of the above responses?
Thank you for the clarifications and thoughtful inputs @jmschrei and @LiBuchauer. Happy to sign off on the review. I will provide some references for cell deconvolution using SA in a couple days.
Thank you @ritika-giri. I understand that you're busy, but keep in mind that this paper has been under review since June 30th. Any speed on your end would be greatly appreciated.
@ritika-giri :wave: please can you get back to @jmschrei ?
Okay, I'm going to go ahead and move forward with this without @ritika-giri's comments.
@editorialbot set <DOI here> as archive
@editorialbot set <version here> as version
@editorialbot generate pdf
@editorialbot check references
and ask author(s) to update as needed
@editorialbot recommend-accept
@editorialbot generate pdf
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1038/s41467-020-19015-1 is OK
- 10.1093/bioinformatics/btz363 is OK
- 10.7554/eLife.26476 is OK
- 10.1038/s41467-018-08023-x is OK
- 10.1101/gr.272344.120 is OK
- 10.1038/s41467-020-15816-6 is OK
- 10.1371/journal.pcbi.1006976 is OK
- 10.1038/nmeth.3337 is OK
- 10.1038/s41587-019-0114-2 is OK
- 10.1016/j.cels.2021.05.006 is OK
- 10.1186/s13059-016-1028-7 is OK
- 10.1126/science.220.4598.671 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.5281/zenodo.3509134 is OK
- 10.21105/joss.03021 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.1371/journal.pbio.3002124 is OK
- 10.1101/2022.11.11.516138 is OK
MISSING DOIs
- None
INVALID DOIs
- Globalquantificationofmammaliangeneexpressioncontrol is INVALID
@LiBuchauer can you provide the DOI for the paper (e.g., from Zenodo) and the version of the code associated with this submission? And can you check out that invalid DOI?
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1038/s41467-020-19015-1 is OK
- 10.1093/bioinformatics/btz363 is OK
- 10.7554/eLife.26476 is OK
- 10.1038/s41467-018-08023-x is OK
- 10.1101/gr.272344.120 is OK
- 10.1038/s41467-020-15816-6 is OK
- 10.1371/journal.pcbi.1006976 is OK
- 10.1038/nmeth.3337 is OK
- 10.1038/s41587-019-0114-2 is OK
- 10.1016/j.cels.2021.05.006 is OK
- 10.1186/s13059-016-1028-7 is OK
- 10.1038/nature10098 is OK
- 10.1126/science.220.4598.671 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.1038/s41586-020-2649-2 is OK
- 10.5281/zenodo.3509134 is OK
- 10.21105/joss.03021 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.1371/journal.pbio.3002124 is OK
- 10.1101/2022.11.11.516138 is OK
MISSING DOIs
- None
INVALID DOIs
- None
Submitting author: @libuchauer (Lisa Buchauer)
Repository: https://github.com/LiBuchauer/cellanneal
Branch with paper.md (empty if default branch):
Version: v1.1.0
Editor: @jmschrei
Reviewers: @ritika-giri, @ManavalanG
Archive: 10.5281/zenodo.10405043
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@ritika-giri & @ManavalanG, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jmschrei know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @ritika-giri
📝 Checklist for @ManavalanG