openjournals / joss-reviews

Reviews for the Journal of Open Source Software

[REVIEW]: BioPandas: Working with molecular structures in pandas DataFrames #279

Closed (whedon closed this issue 7 years ago)

whedon commented 7 years ago

Submitting author: @rasbt (Sebastian Raschka)
Repository: https://github.com/rasbt/biopandas
Version: v0.2.2
Editor: @pjotrp
Reviewer: @krother
Archive: 10.5281/zenodo.804030

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/d611893a9442b7dd421e2a1622a3da18"><img src="http://joss.theoj.org/papers/d611893a9442b7dd421e2a1622a3da18/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/d611893a9442b7dd421e2a1622a3da18/status.svg)](http://joss.theoj.org/papers/d611893a9442b7dd421e2a1622a3da18)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer questions

Conflict of interest

General checks

Functionality

Documentation

Software paper

whedon commented 7 years ago

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks for JOSS. @krother it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that under GitHub's default behaviour you will receive notifications (emails) for all JOSS reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

krother commented 7 years ago

The PDB parsing works out of the box and does not require further explanation. It is a killer app in itself. I found a (probably small) issue in the RMSD calculation feature. At this moment, I am unable to establish the functional claims "reading and parsing millions of small molecule structures (from multi-MOL2 files)" and "filtering molecules by the presence of functional groups" based on the documentation.

krother commented 7 years ago

DOI references for the PDB could be: Nature Structural Biology 10, 980 (2003); doi: 10.1038/nsb1203-980, and Helen M. Berman, John Westbrook, Zukang Feng, Gary Gilliland, T. N. Bhat, Helge Weissig, Ilya N. Shindyalov, Philip E. Bourne, "The Protein Data Bank", Nucleic Acids Res (2000) 28 (1): 235-242; DOI: https://doi.org/10.1093/nar/28.1.235

krother commented 7 years ago

For pandas, a book written by Wes exists (http://shop.oreilly.com/product/0636920023784.do), but it has no DOI. I'd leave the pandas and MOL reference as they are.

rasbt commented 7 years ago

Thanks for the feedback!

> The PDB parsing works out of the box and does not require further explanation. It is a killer app in itself. I found a (probably small) issue in the RMSD calculation feature.

I saw a note about that issue in the biopandas repo and will take a look at it tonight and hopefully resolve it.

> At this moment, I am unable to establish the functional claims "reading and parsing millions of small molecule structures (from multi-MOL2 files)" and "filtering molecules by the presence of functional groups" based on the documentation.

When I wrote this I had another paper in mind that I was working on recently. I am just waiting for the co-author feedback and then the manuscript is going to be submitted to JCIM ~ next week. But I guess it will probably take a bit longer until that's published and citable. There, I used biopandas as an engine to perform filtering by functional groups (spatial distance between functional groups, charge range, and hybridization). I only have a brief section on that in the docs here:

http://rasbt.github.io/biopandas/tutorials/Working_with_MOL2_Structures_in_DataFrames/#parsing-multi-mol2-files

In this example, you can think of the `# do some analysis` placeholder as data frame operations like: does this molecule have an O.2 hybridized and an S.o2 hybridized atom? If yes, are those within a distance of X Å ... and so forth, using and chaining built-in pandas/biopandas functions. I am happy to extend this section with a simple case if necessary. Would that be okay with you? I probably won't do the "million" there because processing millions of molecules could take a few hours (it's a short time for a research project, but maybe a bit too involved for a tutorial :P)
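
As a rough illustration of the kind of data frame operation described above, the check could look something like the sketch below. This is not code from biopandas or its tutorial: it uses plain pandas and NumPy on a hypothetical atom table with an `atom_type` column and `x`, `y`, `z` coordinates, and the helper name `has_pair_within`, the example coordinates, and the 3.0 Å cutoff are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical atom table for a single molecule (assumed column names,
# invented coordinates in Angstrom; not taken from a real MOL2 file).
atoms = pd.DataFrame(
    {
        "atom_name": ["C1", "O1", "S1", "O2"],
        "atom_type": ["C.3", "O.2", "S.o2", "O.3"],
        "x": [0.0, 1.2, 4.0, 9.0],
        "y": [0.0, 0.0, 0.0, 0.0],
        "z": [0.0, 0.0, 0.0, 0.0],
    }
)


def has_pair_within(atoms, type_a, type_b, max_dist):
    """Return True if any atom of type_a lies within max_dist of any atom of type_b."""
    coords_a = atoms.loc[atoms["atom_type"] == type_a, ["x", "y", "z"]].to_numpy()
    coords_b = atoms.loc[atoms["atom_type"] == type_b, ["x", "y", "z"]].to_numpy()
    if len(coords_a) == 0 or len(coords_b) == 0:
        # One of the two atom types is absent from this molecule.
        return False
    # Pairwise differences via broadcasting: shape (n_a, n_b, 3).
    diff = coords_a[:, None, :] - coords_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return bool((dist <= max_dist).any())


# True: the O.2 and S.o2 atoms in the toy table are 2.8 apart.
print(has_pair_within(atoms, "O.2", "S.o2", max_dist=3.0))
```

When iterating over a multi-MOL2 file, a check like this would be applied to each molecule's atom table in turn, keeping only the molecules for which it returns True.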

rasbt commented 7 years ago

> DOI references for the PDB could be: ...

Thanks, updated those references!

krother commented 7 years ago

argh, I didn't spot the menu bar on top on the documentation page. It's quite impressive and complete as it is, thanks! :+1:

rasbt commented 7 years ago

> I found a (probably small) issue in the RMSD calculation feature.

This should be fixed in the master branch now (https://github.com/rasbt/biopandas/pull/34). For JOSS, I assume I need to make a new version release so that this change takes effect? It's not a problem, but I think I should hold off a bit in case there's something else that needs to be addressed. (Once everything looks fine, I can make a new stable release, i.e., 2.1.0 -> 2.1.1 or 2.2.0). Is that okay?

krother commented 7 years ago

Checked the changed version by installing with pip directly from GitHub. Great that you put this in the documentation! And it works, too.

krother commented 7 years ago

Created a PR where I indicated a possible location for the DOI in the PDB reference. No more issues from me. Congratulations on this contribution, it is time to get the word out!

rasbt commented 7 years ago

Thanks @krother . Just merged the DOI PR, and I am happy to hear that everything looks fine now :)

krother commented 7 years ago

OK, review done.

pjotrp commented 7 years ago

@krother thanks! @arfon must be a record here. Ready to R&R.

arfon commented 7 years ago

@rasbt - Could you move the references you currently have in the paper.md file into a paper.bib file and cite them directly please? (You can read how to do that here)

When you've done that could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive? I can then move forward with accepting the submission.

rasbt commented 7 years ago

@arfon no problem.

I moved the references to a separate bibtex file, paper.bib.

> When you've done that could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive?

Here's the link for the 0.2.1 version that was reviewed:

https://zenodo.org/record/574879

This existed prior to the submission though and thus, it doesn't include the paper.md and paper.bib files.

I made a new biopandas version (v0.2.1 -> v0.2.2) for Zenodo, incorporating the paper files and also including a small fix that came up during the review process. The Zenodo link and DOI:

https://zenodo.org/record/804030

arfon commented 7 years ago

@whedon set 10.5281/zenodo.804030 as archive

whedon commented 7 years ago

OK. 10.5281/zenodo.804030 is the archive.

arfon commented 7 years ago

@krother many thanks for your rapid review here and to @pjotrp for editing this submission ✨

@rasbt - your paper is now accepted into JOSS and your DOI is http://dx.doi.org/10.21105/joss.00279 ⚡️:rocket: :boom:

rasbt commented 7 years ago

That's awesome! Thanks a lot, everyone, for all the work and the very seamless process!