pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/

Presubmission enquiry for genomepy #56

Closed simonvh closed 1 year ago

simonvh commented 2 years ago

Submitting Author: Simon van Heeringen (@simonvh)
Package Name: genomepy
One-Line Description of Package: simple and straightforward downloading and management of genomic data.
Repository Link (if existing): https://github.com/vanheeringen-lab/genomepy


Description

Genomepy is designed to provide a simple and straightforward way to download and use genomic data (genome sequences and gene annotation). This includes (1) searching available data, (2) showing the available metadata, (3) automatically downloading, preprocessing and matching data and (4) generating optional aligner indexes. All with sensible, yet controllable defaults. Currently, genomepy supports Ensembl, UCSC, NCBI and GENCODE.

Scope

Genomepy streamlines data retrieval from genome repositories (for example, genomepy install hg38 --annotation instead of searching for the genome download link by hand and downloading gene annotation from a separate source). It can read various formats and converts gene annotation to a common and consistent format (data extraction and munging). All steps, download sources and identifiers are logged for reproducibility.

The target audience is any computational biologist or bioinformatician in genomics. Genomepy makes it very easy to work with different genomes and to incorporate genome download in pipelines. Even when only working with one species (i.e. human or mouse), it streamlines genome download and use.

There are three similar packages that we are aware of. GoGetData and RefGenie can download genomes and genomic assets. However, these packages depend on predefined recipes, while genomepy can download any genome that is hosted on one of the major providers. In addition, genomepy provides a Python API to use the genomic data. The third, ncbi-genome-download, focusses on bacterial and fungal genomes, whereas genomepy is agnostic to species.

An older version of Genomepy was already published in JOSS.


NickleDave commented 2 years ago

Hi @simonvh, welcome to pyOpenSci and thank you so much for your detailed presubmission inquiry!

Genomepy definitely looks in scope to me.

There is a bit of a question about how to handle the fact that it's already been reviewed by JOSS. In this case I see that the JOSS review was in 2017, and that you all have been steadily developing the package since that time.

Can you tell us a little bit more about what you are hoping to get out of a pyOpenSci review?

If it is something like "I feel like we would benefit from a new review after 5 years of development" then I think it could definitely be appropriate.

For sure I do not mean to discourage you.
It does seem like Genomepy would be a good fit and I am fairly confident we can find editors and reviewers from previous reviews of bioinformatics packages.

We are in a better position to put more process around this situation now that our executive director @lwasser has been able to move pyOpenSci to a new fiscal host (we're just starting up reviews again after a pause). This is a good time for us to figure out how to handle it, if you can bear with us as we work it out on the fly 😄.

Genomepy looks like a great library and we'd like to help you if we can.

lwasser commented 2 years ago

Thank you @NickleDave !! @simonvh yup, we just want to understand your goals in submitting here. I think it would be great to support packages that have been reviewed by JOSS, given that we offer more Python-specific feedback in our reviews in addition to providing the visibility of being a pyOS-vetted package! So there are many wins here. But any insight into your goals for this review would be wonderful as we work through how we handle this type of submission! We look forward to hearing from you!

NickleDave commented 2 years ago

Hi again @simonvh, just following up here.

@lwasser and I had a chance to talk a little bit more about this presubmission today.

We do think in this case there's no problem with the previous review from JOSS, given that it was 2017 and it does look like you all have done significant development of the package since that time.

I am confident we can find appropriate editors and reviewers by reaching out to contributors involved with previous bioinformatics and -omics packages we have reviewed.

Please feel free to start a new issue for a full submission--once you do we can close this one--and just let us know if you have any questions.

simonvh commented 2 years ago

Hi @NickleDave and @lwasser, sorry for the delay in answering! Thanks for the input here. Just to add to the discussion: my aim was primarily to 1) gain more visibility for genomepy and 2) to support pyOpenSci, which I didn't know about and heard about through Twitter. In addition to considering pyOpenSci, we have also recently submitted genomepy as a journal paper (preprint here: https://arxiv.org/abs/2209.00842).

From your reactions, it seems that the review process is more extensive than I had thought, which is good news as it would focus more on the technical side and we're always eager to learn there. We have indeed significantly expanded on genomepy as originally published in JOSS. If you indeed think it would be OK from your side, then I would be happy to start a full submission.

NickleDave commented 2 years ago

Thank you @simonvh.

We're glad to help gain visibility for genomepy and appreciate your support.

"[I]t seems that the review process is more extensive than I had thought. We have indeed significantly expanded on genomepy as originally published in JOSS."

Great, it sounds like we're on the same page that a review would be appropriate given the amount of time that's passed and the development that's been done :+1:

And just to clarify what pyOpenSci does that is different:

NickleDave commented 2 years ago

"In addition to considering pyOpenSci, we have also recently submitted genomepy as a journal paper (preprint here: https://arxiv.org/abs/2209.00842)."

Ok understood.
We have had pre-prints submitted before (see e.g. #26 ) but I think this is the first time someone has simultaneously submitted to pyOpenSci and a journal.

We just want to be sure we're handling this the right way.
@lwasser is checking in with contacts at JOSS and rOpenSci.
We will get back to you within a week.

There's also a bit of a question of whether people would cite the journal article, the version of the package that pyOpenSci reviewed, or both.
(Both can work! If you use a CITATION.cff :slightly_smiling_face:)
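As a rough sketch of how that could look: a CITATION.cff can describe the software itself and also carry a `preferred-citation` entry that points at an article. The small Python helper below just assembles such a file as text. The helper name `build_citation_cff` is hypothetical, and the version, article title and journal are placeholders, not genomepy's actual metadata.

```python
import json


def build_citation_cff(title, authors, version, article_title, journal):
    """Return CITATION.cff text that cites the software and an article.

    `authors` is a list of (family_names, given_names) tuples.
    Values are emitted as JSON strings, which are valid YAML scalars.
    """
    def author_lines(indent):
        lines = []
        for family, given in authors:
            lines.append(f"{indent}- family-names: {json.dumps(family)}")
            lines.append(f"{indent}  given-names: {json.dumps(given)}")
        return lines

    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite the software and the article."',
        f"title: {json.dumps(title)}",
        f"version: {json.dumps(version)}",
        "authors:",
        *author_lines("  "),
        # The article goes under preferred-citation; the top-level keys
        # above still describe the software release itself.
        "preferred-citation:",
        "  type: article",
        f"  title: {json.dumps(article_title)}",
        f"  journal: {json.dumps(journal)}",
        "  authors:",
        *author_lines("    "),
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
cff_text = build_citation_cff(
    title="genomepy",
    authors=[("van Heeringen", "Simon")],
    version="0.0.0",
    article_title="Placeholder article title",
    journal="Placeholder Journal",
)
print(cff_text)
```

Writing `cff_text` to a file named CITATION.cff at the repository root is what lets tools pick it up. A real file would also want a DOI for each entry once those exist.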

NickleDave commented 1 year ago

Hi @simonvh, one more update on this. With regard to publishing in other venues, we will be adopting the approach that rOpenSci takes, to avoid potential conflicts. From their guide:

.1.1 Publishing in other Venues

We strongly suggest submitting your package for review before publishing on CRAN or submitting a software paper describing the package to a journal. Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions. We do not consider previous publication on CRAN or in other venues sufficient reason to not adopt reviewer or editor recommendations. Do not submit your package for review while it or an associated manuscript is also under review at another venue, as this may result on conflicting requests for changes.

I apologize that we did not previously have clear guidance in place. @lwasser is in the process of adding this language to our guidebook now.

Long story short:
We really appreciate that you want to support pyOpenSci. We would definitely be more than happy to provide you with a review after the review process ends at the journal where you have currently submitted, especially if we can help you add a shiny new pyOpenSci badge to your repo and get more visibility for genomepy.

Please let me know if that's clear. I appreciate your understanding and patience as we figure out this process on the fly.

simonvh commented 1 year ago

Hi @NickleDave, thanks and no worries! I understand that this all needs to be figured out. We'll wait, and I'll check back in at a later stage.

Just as an aside, I would personally love for venues like JOSS and pyOpenSci to also be sufficient as publication. At least going by the JOSS review, the review process is much more useful, technically informative and geared towards improving a software package. It is less adversarial than the traditional review process as well. However, as far as visibility goes, it is (sadly) in the biosciences still no match for a PubMed-indexed journal. So in that sense, double citations would I guess still be the way to go.

I'll close the thread, we can always re-open in the future. Thanks for thinking along!

lwasser commented 1 year ago

@simonvh thank you so much for this feedback. I agree. rOpenSci actually has agreements with several publications to allow their review to suffice for that publication's review. Then the publication can focus on the paper part and we focus on the technical part. Please feel free to mention this to the publication that you are working with, as we are open to it. We have that type of agreement with JOSS: our review is much more Python-specific and we offer more things to maintainers, such as supporting longer-term maintenance and visibility in the scientific Python community. So JOSS accepts our review as a part of their process if the package is in scope for them.

So in short, bear with us as we figure things out. We've been running peer review for a few years but are just getting organized as a new independent organization. As such we have many partnerships to make and much to do!

You are WELCOME to submit to us after your review is done if you want ANOTHER review and the visibility of our platform behind you. And in the meantime send whomever our way to start discussions around the double review vs just accepting our reviews! leah at pyopensci.org is a good way to reach me if you have any other questions or thoughts as well!