[REVIEW]: Ethome: machine learning for animal behavior

editorialbot commented 1 year ago

Submitting author: !--author-handle-->@benlansdell@adi3<!--end-editor-- Reviewers: @imcatta, @neuromusic, @KonradDanielewski Archive: 10.5281/zenodo.10680136

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/0472dab158806827a83da79e602e16e4"><img src="https://joss.theoj.org/papers/0472dab158806827a83da79e602e16e4/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/0472dab158806827a83da79e602e16e4/status.svg)](https://joss.theoj.org/papers/0472dab158806827a83da79e602e16e4)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@imcatta & @neuromusic, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @adi3 know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @imcatta

📝 Checklist for @KonradDanielewski

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 year ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.12 s (732.9 files/s, 197646.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                      26           1013           1272           4908
HTML                            11           3327              8           4705
Python                          23            863            836           3036
Markdown                        14            996              0           1356
TeX                              1              0              0            168
Ruby                             1             28             12            106
YAML                             4             10             10            104
XML                              1              0              0             53
Jupyter Notebook                 1              0            225             51
make                             1             11             21             42
JSON                             2              0              0             24
TOML                             1              0              0              6
-------------------------------------------------------------------------------
SUM:                            86           6248           2384          14559
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 1 year ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.7554/eLife.63377 is OK
- 10.48550/ARXIV.2011.13917 is OK
- 10.1101/2020.05.14.095430 is OK
- 10.1101/2020.04.19.049452 is OK
- 10.5281/zenodo.6669074 is OK
- 10.1038/s41386-020-0776-y is OK
- 10.1016/j.conb.2019.10.008 is OK
- 10.7554/eLife.63720 is OK
- 10.1038/s41467-021-25420-x is OK
- 10.1101/2022.11.04.515138 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 1 year ago

Wordcount for paper.md is 931

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

imcatta commented 1 year ago

Review checklist for @imcatta

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/benlansdell/ethome?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@benlansdell) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

adi3 commented 1 year ago

Folks, apologies for the delay in pre-review phase but we'll pick up speed during this review. Thanks for your patience!

👋🏼 @benlansdell this is the review thread for the paper. All of our communications will happen here from now on.

👋🏼 @imcatta, @neuromusic - you both should generate your checklists with the JOSS requirements by running @editorialbot generate my checklist. As you go over the submission, please check the items that you feel have been satisfied and let the author know where further work needs to be done.

Here is a little more context for first-time reviewers :) - The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention #5623 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use @editorialbot to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@adi3) if you have any questions/concerns. Thank you for all your help!!

neuromusic commented 1 year ago

@editorialbot commands

editorialbot commented 1 year ago

Hello @neuromusic, here are the things you can ask me to do:


# List all available commands
@editorialbot commands

# Get a list of all editors's GitHub handles
@editorialbot list editors

# Check the references of the paper for missing DOIs
@editorialbot check references

# Perform checks on the repository
@editorialbot check repository

# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist

# Set a value for branch
@editorialbot set joss-paper as branch

# Generates the pdf paper
@editorialbot generate pdf

# Generates a LaTeX preprint file
@editorialbot generate preprint

# Get a link to the complete list of reviewers
@editorialbot list reviewers

neuromusic commented 1 year ago

Thanks @adi3!! I'm going to be unavailable through August 1st, but will be able to start my review then.

adi3 commented 1 year ago

@neuromusic that will work out just fine, thank you!

Kevin-Mattheus-Moerman commented 1 year ago

@neuromusic to get started, the command you were looking for was: @editorialbot generate my checklist

imcatta commented 1 year ago

Hi @benlansdell! I've completed my review. Here are my comments.

The package is interesting and useful. While running through the code, I had the feeling that it is well structured and of good quality. I also appreciated that the package is made available through PiPY and that there is a pipeline for automated testing. However, I think there is still room for improvement in the documentation. Here are my specific comments.

General checks

The project includes some sample data from animals, but no evidence is provided that this data was collected in accordance with the JOSS Policy.

Functionality

There are two methods to get example data: get_sample_nwb__paths(), which returns NWB data, and get_sample_data_paths(), which returns DLC and BORIS. I suggest renaming the latter to something that recalls DLC and BORIS for consistency.

Documentation

I struggled a little in following the code presented in the README.
- dataset.features.add('cnn1d_prob') If I run this on the nwb example dataset, as suggested above, it tells me that DataFrame doesn't have necessary columns to compute this set of features.. Also, I can’t find which columns are needed. Please consider stating in the documentation what columns are needed and how to get those in the sample dataset.
- cross_val_score(..., dataset.ml.labels, ...). No labels are present in the nwb example dataset, so this is not working with the dataset proposed above.
- Please add package requirements (e.g. Python versions supported) to the README.
- “For this, you must have labeled your body parts in a certain way (refer to How To)”. Please add the link to the How to section.
- Please add a link to the web documentation in the README.
It is stated that “dataset.features.add('cnn1d_prob')Uses a pretrained CNN to output probabilities of 3 behaviors (attack, mount, social investigation).”. More information on this is needed. How was this CNN built? Also, I found out that the repository contains 5 pre-trained models, but no information on that is provided.
Some users may not be fully confident with the DLC, SLEAP, NWB, BORIS and NWB file formats. Please explain the different data input formats supported in more detail.
Guidelines for report issues or problems with the software and seek support are missing.
In How To guide, section “3a In-built support for resident-intruder setup”, more information on what is MARS and context should be provided.

benlansdell commented 1 year ago

Thanks for the review @imcatta. For some of these points, I'll create issues in the package repo and let you know when I've addressed them.

adi3 commented 1 year ago

@imcatta thanks for the review so far! @benlansdell looking forward to seeing your changes and getting the first reviewer's checklist complete. @neuromusic should be back soon and will begin the second review sometime this week.

adi3 commented 1 year ago

@neuromusic - could you kick off your review please? :)

Kevin-Mattheus-Moerman commented 1 year ago

@neuromusic thanks for agreeing to help with this review. If you are still able to help, please get started at your earliest convenience to avoid a delay with this submission. Thanks!

Kevin-Mattheus-Moerman commented 1 year ago

@benlansdell could you provide an update to say if you have dealt with the issues raised by @imcatta? Thanks

Kevin-Mattheus-Moerman commented 1 year ago

@benlansdell please respond to the above :point_up:

To avoid delays it would be great if you could respond to queries in a timely manor. If instead you are no longer interested in pursuing a JOSS publication please do let us know.

Kevin-Mattheus-Moerman commented 1 year ago

@adi3 please can you pick this one up again. It looks like we may need to find an alternative reviewer, since @neuromusic has not responded.

adi3 commented 1 year ago

@Kevin-Mattheus-Moerman for sure! I think we need to first wait for @benlansdell to resolve outstanding issues mentioned by the reviewer to ensure there is motivation to take this publication forward.

benlansdell commented 1 year ago

@adi3 and @Kevin-Mattheus-Moerman I am still motivated to take the publication forward at JOSS! Let me address these issues this week. Thanks for keeping this moving.

benlansdell commented 12 months ago

Hi @adi3, @Kevin-Mattheus-Moerman and @imcatta, I have made changes based on @imcatta's feedback. The responses to these comments are detailed in PR #4 and PR #6 in the source repo. I welcome further discussion there about whether they fully satisfy @imcatta's feedback.

One outstanding issue is the use of animal data in the repo. Some of the sample data is from a manuscript in preparation from my previous lab. I assume that because it is as-yet unpublished I need to either:

Provide all the relevant experimental methods in this repo, OR
Remove that sample data and replace it with something already published

imcatta commented 12 months ago

Hi @benlansdell! Thanks for the update. I think the project is moving in the right direction, but there are still a couple of things that can be clarified.

Human and animal research: as you mentioned, this is still an open point at this time
Community guidelines: I'm having trouble locating the guidelines.

benlansdell commented 11 months ago

@imcatta , The community guidelines are CODE_OF_CONDUCT.md and CONTRIBUTING.md. Let me know if you think something additional is needed. Or if they could be more clearly pointed to in the repo.

For the animal data, I think the best thing is to rework the sample data to remove any unpublished data. Thanks for bringing this point to my attention.

imcatta commented 11 months ago

My bad 😅 I didn't find them in the readme or the documentation page, so I mistakenly assumed they were missing.

At this point, the only thing left on my checklist is human and animal research. The solution you proposed sounds good to me.

benlansdell commented 11 months ago

Hi @imcatta, I have added a statement to the readme about the source of the animal data. Let me know if you think this is sufficient details. In particular, the data complies with St Jude's Institutional Animal Care & Use Committee. See here: https://github.com/benlansdell/ethome/issues/5

imcatta commented 11 months ago

Hi @benlansdell, I think this solves the problem. I have no more items left on my checklist and I recommend accepting this submission!

benlansdell commented 11 months ago

@imcatta thanks for your input. Happy to discuss anything else if you think of it.

@adi3 , @Kevin-Mattheus-Moerman let me know if you'd like me to recommend further potential reviewers to keep this moving forward.

benlansdell commented 11 months ago

@adi3 , @Kevin-Mattheus-Moerman I could suggest MatthewBCooke as one additional option.

Kevin-Mattheus-Moerman commented 11 months ago

@adi3 :wave:

adi3 commented 11 months ago

@MatthewBCooke can you take up this review? We already have one thorough review complete so the paper should be quite close to approval already. Would appreciate your help!

adi3 commented 11 months ago

@benlansdell feel free to suggest 3-5 more reviewers since their availability might be limited. Thank you!

benlansdell commented 11 months ago

Sure, you can also try: jeylau, CBroz1, KonradDanielewski

adi3 commented 10 months ago

@jeylau @CBroz1 @KonradDanielewski could you help us carry out this review? We already have one thorough review complete so the paper should be quite close to approval already. Thanks in advance!

KonradDanielewski commented 10 months ago

@benlansdell @adi3 Thank you for considering me. I'm quite busy at the moment, but if the review deadline would be after Christmas I'd be ok with that.

CBroz1 commented 10 months ago

I'm in the same boat - I wouldn't be able to give this my full attention until the first week of Jan. Lmk if that timeline works for you

adi3 commented 10 months ago

@KonradDanielewski no problem at all, would be great if you could get to this complete within Jan '24. I'll assign you as the reviewer. Thanks!

adi3 commented 10 months ago

@editorialbot add @KonradDanielewski as reviewer

editorialbot commented 10 months ago

@KonradDanielewski added to the reviewers list!

adi3 commented 10 months ago

@KonradDanielewski please see this message for a few review instructions. Cheers!

KonradDanielewski commented 10 months ago

Review checklist for @KonradDanielewski

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/benlansdell/ethome?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@benlansdell) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

adi3 commented 9 months ago

@KonradDanielewski thank you for starting your review! Let me know if I can help in any way moving forward. Thanks!

KonradDanielewski commented 9 months ago

Hi @benlansdell and @adi3, I have completed my review.

I have some comments regarding the codebase styling, a minor oversight in the example and a general comment on installation, plus some usage issues I encountered when testing it out:

In the example sample_workflow it still calls get_sample_data_paths instead of get_sample_data_paths_dlcboris
That may just be a personal preference but I like when a yaml file is provided to create a separate environment with conda and it contains all the necesarry dependencies - for instance the missing FFmpeg
In io.py read_DLC_tracks uses splitext but compares to file extensions without the . hence if statement is never True and cannot load h5 files.
That one is strange, but when running create_dataset the animal_renamer has to be provided as None (even though it's the default value) if not being used, otherwise, at least for me it throws an error like a dict was passed instead of the file path. Happens in read_boris_annotation here when loading the BORIS csv file.
For the speeds feature for all my files first five indices are not calculated - which stems from here where dt = features_df['time'].diff(periods = n_shifts) . Same will happen for compute_part_speed here. Because periods=5 first five indices have nothing to be compared to resulting in NaNs, propagating them further in the feature extraction. Then modelling crashes as there are NaNs in the data.
Lastly, about the styling, this is rather minor and quickly fixable - in some places type hints are provided in other they are not and in some function definitions arguments are in separate lines while in others they are in one line (as far as I noted these are in the features part of the module).
More examples could be provided for different inputs so that the users can quickly navigate or follow an example notebook by just pointing to their directories with data.
Maybe I missed something in the How to guide but I don't see a clear indication of which features can be added for non-MARS data

Other than that I really like it and think it will be a great resource for many people working with pose estimation data. All the issues are rather minor and fixing them should be quick and easy as the code is very nicely structured and readable.

Let me know if there is anything else you require

benlansdell commented 9 months ago

Hi @KonradDanielewski, thanks for the review! These are some good points. I have created issues in the ethome repo to track addressing each of these. I'll get started on this.

adi3 commented 9 months ago

@benlansdell do let know once you've addressed all the points raised by @KonradDanielewski and I'll be happy to push your submission towards publication. Thanks!

benlansdell commented 9 months ago

Thanks @adi3 , I've started working through them. Should be able to get the remaining items addressed on the weekend.

benlansdell commented 9 months ago

Thanks again @KonradDanielewski for providing feedback. I've addressed almost all of the issues you raised. One remaining one is listed here: https://github.com/benlansdell/ethome/issues/10

It's the issue you pointed out with having to provide animal_renamer = None, even though it's already None by default. I can't seem to reproduce this one. Are you able to provide some short code that produces the error? (Possibly with a BORIS file, if you used one that's not included in the sample data in the package) Thanks!

KonradDanielewski commented 9 months ago

Sure, I'll try to run it again today

KonradDanielewski commented 9 months ago

It's the issue you pointed out with having to provide animal_renamer = None, even though it's already None by default. I can't seem to reproduce this one.

Hi, so I tried to replicate it and also couldn't... Maybe let's consider this addressed. If I encounter it again I will log what happened exactly and make an issue - this could've also been some weird error on my part as a I was applying quickfixes to the code while testing to provide explanations

openjournals / joss-reviews