[JOSS REVIEW] Paper

As part of https://github.com/openjournals/joss-reviews/issues/3702

Summary

I think the summary nicely conveys, in layman's terms, the aim of the package, which is to make different methods in biopsychology accessible and systematically usable through a single package. However, the third paragraph of the summary, which talks about the use cases of these data, isn't the most relevant to building an argument for why BioPsyKit is needed. Rather than a generic description of when the data are collected and which analysis types exist, the purpose of this package may be made more convincing by stating what the gaps in current biopsychology methodologies are and what consequences they have. Although it is briefly mentioned (in the statement of need) that researchers conventionally use different assessment modalities, I think it is the implications of this approach that need to be highlighted here. For example, how does using different assessment modalities impact reproducibility in research? How does combining tools in BioPsyKit facilitate/streamline analyses, and why is it important? At the analysis level, are the algorithm pipelines accessible to researchers, or are they too opaque, and how does BioPsyKit address this? These are some questions that I think would be important to answer in order to show the gap BioPsyKit specifically fills.

State of the field

Currently, related packages in the field are not yet acknowledged in the paper. A good start may be to compare BioPsyKit with more signal-specific packages such as pyHRV and antropy, both of which have functionalities that overlap with BioPsyKit, as they focus on ECG and EEG signals respectively. Apart from mentioning neurokit2 as a dependency, it may also be helpful to elaborate on how BioPsyKit’s aim and functionalities are distinct from and/or complementary to it, given that neurokit2 similarly accommodates a variety of biosignals.

On a related note, I do think that the scope of BioPsyKit is currently not very well-defined because of the multiple modalities included (even though this can also be perceived as a strength of the package). I understand that from the outset, it is explicitly stated that the purpose was to combine tools in biopsychology. However, I am not sure what the added benefits are of having, for example, questionnaire/protocol implementations packaged together with electrophysiological data processing (i.e., is this just for convenience?). If the aim is to facilitate simultaneous processing of different data, then I think some integration of functionalities needs to be available. As of now, the existing modules seem quite independent of each other and sit alongside miscellaneous utilities like data wrangling and stats implementations. Without such integration, users of BioPsyKit would have processing/analysis pipelines as lengthy as if they were to use signal-specific packages that are already well-established. 🤔 Just some thoughts!

Quality of writing

Overall, I think the paper flows nicely and the descriptions are concise and straight to the point. I just have two minor comments:

1) Figure 1 nicely depicts the structure of BioPsyKit, but I have some questions regarding sleep_wake and sleep_endpoints. If the difference is that the former detects when individuals wake up and the latter detects when sleep ends, are they functionally equivalent? Some elaboration of these submodules’ features in the figure may help here. Additionally, I realized from the repo that data_handling isn’t a submodule on its own like the others, as its listed functionalities seem to be subsumed under the utils submodule. Perhaps change data_handling to utils so that the figure is consistent with the actual structure of the submodules!

2) The need for psychological protocols in a software package is not very clear to me yet. I understand that this may be a word limit issue, but I think their practical functionality could be made clearer, on top of just stating which protocols are available. From the code, it seems like their purpose is to provide a “data structure” for organizing the different modalities of experimental data. If so, I think this is important to state in the paper.

Let me know what you think! :)

mad-lab-fau / BioPsyKit

[JOSS REVIEW] Paper #12
