pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/
90 stars 35 forks source link

Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases #67

Closed Robaina closed 1 year ago

Robaina commented 1 year ago

Submitting Author: Semidán Robaina (@Robaina) All current maintainers: @Robaina Package Name: Pynteny One-Line Description of Package: Query sequence database by HMMs arranged in predefined synteny structure Repository Link: https://github.com/Robaina/Pynteny Version submitted: v0.0.5 Editor: @arianesasso Reviewer 1: @Batalex Reviewer 2: @c-thoben
Archive: DOI JOSS DOI: DOI Version accepted: 1.0.0 Date accepted (month-day-year): 03-10-2023


Description

Pynteny is Python tool to search for synteny blocks in (prokaryotic) sequence data through HMMs of the ORFs of interest and HMMER. By leveraging genomic context information, Pynteny can be employed to decrease the uncertainty of functional annotation of unlabelled sequence data due to the effect of paralogs. Pynteny can be accessed (i) through the command line, (ii) as a Python module, or (iii) as a (locally served) web application.

Scope

Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.

Pynteny's main objective is to provide a means to query NGS (unannotated) sequence databases, such as metagenomic/metatranscriptomic datasets using syntenic blocks (i.e. spatial arrangements of genes) rather than single target genes/protein domains. In this sense, I would classify Pynteny within Data Extraction.

Pynteny was designed to be used by researchers working with large, unannotated sequence databases, such as those typically encountered in metagenomic analyses. It can be accessed through a command line interface or easily integrated into pipelines as a Python package. Pynteny can also be used through a graphical interface running locally in the browser, which is more suitable for educational purposes.

To the extent of my knowledge, there isn't any Python package that provides the functionality provided by Pynteny.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication options

I had submitted this package for publication at JOSS prior to pyOpenSci. The submission is currently under consideration for scope in this issue: https://github.com/openjournals/joss-reviews/issues/4978

JOSS Checks - [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [x] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [x] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [x] The package is deposited in a long-term repository with the DOI: [![DOI](https://joss.theoj.org/papers/10.21105/joss.05289/status.svg)](https://doi.org/10.21105/joss.05289) *Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text-based review. It will also allow you to demonstrate addressing the issue via PR links.

Code of conduct

Please fill out our survey

P.S. *Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

[Editor and review templates can be found here]: https://www.pyopensci.org/software-peer-review/how-to/editors-guide.html and https://www.pyopensci.org/software-peer-review/how-to/reviewer-guide.html

NickleDave commented 1 year ago

Thank you @Robaina for submitting Pynteny to pyOpenSci. As you noted, we have already discussed on #65 and you have done extensive development to prepare for submission.

Here are the initial editor checks:



Editor comments

The author has done a ton of work to get ready for submission. Pynteny is looking really good. We will need a contributing.md before the review finishes but that's the only thing I really see missing right now.

Thank you again @Robaina. We will get an editor and reviewers as soon as possible.
I need to confirm timelines with @lwasser but I would expect it might take a bit longer because of the holidays, my guess is two weeks. Will let you know for sure by the end of this week, but wanted to make sure you knew we have seen your submission and we are beginning the review.

Robaina commented 1 year ago

Hi @NickleDave,

thanks for accepting the submission! I have included a "CONTRIBUTING.md" in the root directory of the repo, also moved instructions on how to set a developer environment to that file. Looking forward to the review process -- Btw, I think I did fill out the initial onboarding survey, please confirm.

Best, Semidán

NickleDave commented 1 year ago

Perfect, thank you Semidán!

Following up on my last message: we do expect to get review started within the next two weeks. People are off-line because of the holidays but we will get the ball rolling now.

arianesasso commented 1 year ago

Dear @Robaina, nice to meet you!

I just wanted to say hi 👋🏼 and inform you that I am pleased to be the editor for this review. I see that you and @NickleDave already did great work during the pre-submission phase! We are recruiting the reviewers now, and I will update you once they are onboard.

Robaina commented 1 year ago

Hi @arianesasso, nice to meet you too!

Awesome. Thanks for letting me know and for being the editor. Looking forward to the review! (und guten Rutsch ins neue Jahr)

arianesasso commented 1 year ago

Dear Robaina,

I hope you had a great start to the year as well! I am sorry for the delay. We are still looking for our second reviewer.

However, I would like to introduce you to @Batalex, who is already on the job as the first reviewer 😄 . We have previously talked about the package, and he already has access to the reviewer's guide, the reviewer template, and the Python packaging standards detailed in the packaging guide. So, we will be able to continue moving 🚀 .

And please feel free to ping me at any time if you need help or have questions!

Robaina commented 1 year ago

Hi @arianesasso,

no problem, thanks for the update :) Hi @Batalex, thanks the review!

Batalex commented 1 year ago

Hey there! I am almost done with my review. No promises, but I think I can deliver this weekend. Looking forward to discussing it with you! 🙌

Batalex commented 1 year ago

Here comes a big one! @Robaina @arianesasso

Foreword

I want to thank both the author and the editors from pyOpenSci for the opportunity to tackle this package review. The following comments reflect my experience as I maintain private packages, and all remarks come with pynteny's best interest at heart.

Some comments are regrouped at a "general" level if they concern the project structure or several files. If I encountered a situation that I had already pointed out on another file, I did not comment again. However, I encourage taking into account the remarks from one file to the others.

Several comments I made are only valid for the submitted version (0.0.5) and were fixed in the meantime. Nonetheless, I will mark off what has been done later.

With that being said, let us get started!

I used the black cat🐈‍⬛ to highlight my personal thoughts in the review template below.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

🐈‍⬛ The submitted version uses a personal conda channel in README, which is something I would not recommend. However, it uses bioconda in the main branch, so it is all good.

🐈‍⬛ There are some improvements to be made regarding running the examples locally, see below.

🐈‍⬛ In Pynteny 0.0.5, contributions guidelines are included in the README file. A more detailed CONTRIBUTING file has been added since then.

🐈‍⬛ I noticed that two of the three links to the Pynteny repo use the HTTP scheme, and the third uses HTTPS.

Readme file requirements The package meets the readme requirements below:

The README should include, from top to bottom:

🐈‍⬛ Additional badges have been added since the 0.0.5 release.

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

🐈‍⬛ I ran the package in WSL since Apple silicon is not supported. The installation went smoothly, except for the really long resolution time. I am glad to see that the main branch encourages to use mamba.

Functionality

🐈‍⬛ As far as I can tell, it seems to do what it claims.

🐈‍⬛ NA, no performance claim to be found in the documentation.

🐈‍⬛ Tests covers 45% of pynteny. The coverage is heterogenous: the application and CLI parts of the package are less tested than the rest.

🐈‍⬛ Unit tests did not run on CI as of 0.0.5. This has changed since then. The documentation website is built and deployed using CI, nice!

🐈‍⬛ The code base does not fully follow PEP 8. See below for more information.

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: ~8 hours


Review Comments 🐈‍⬛

General comments:

CLI

Edits

Robaina commented 1 year ago

Hey @Batalex ,

thanks so much for your very detailed review! I quickly run through your comments and I see that they will be very useful. I'll address them carefully in the following days. But wanted to thank you now.

arianesasso commented 1 year ago

@Batalex, what a beautiful and thoughtful review! I really enjoyed the 🐈‍⬛ haha and, of course, all the meaningful recommendations you made! Thank you for all your effort 🙏🏼.

@Robaina, hopefully, we will have the second reviewer onboarded soon. I will keep you posted!

arianesasso commented 1 year ago

Hi @Robaina, I want to introduce you to our second reviewer @c-thoben. She will be starting her first review with us 🥳 . Keep tuned for more updates soon!

Robaina commented 1 year ago

Hi @arianesasso, great, thanks for letting me know. Hi @c-thoben, thanks for offering to review!

Batalex commented 1 year ago

Hey @Robaina, if you would like, I can free some time in case you want a reviewer for your upcoming PRs on Pynteny. This may be a little easier to discuss the suggested changes.

Robaina commented 1 year ago

Hi @Batalex ,

thanks for the offer! Will do that for the ones I'm currently working on.

arianesasso commented 1 year ago

Great job @Robaina! Thank you also @Batalex, for offering more time to help with the PRs 🙏🏼

arianesasso commented 1 year ago

We are onboarding @c-thoben and soon you will have your second review 🥳 .

Robaina commented 1 year ago

Hi @Batalex,

thanks again for your review. I noticed that some of your comments were not addressed in the previous PRs. I'll clarify some of them here (preparing a PR for the remaining):

  1. There are a few discrepancies in types, some of them could be fixed by converting argparse.ArgumentParser instances to CommandArgs before using them

I understand what you mean here: Subcommand functions in pynteny.subcommands currently accept both an argparse.ArgumentParser object or a pynteny.utils.CommandArgs object. This flexible type in the argument was made on purpose so subcommand could be called both from the CLI and from within Python. I don't understand why this is undesirable. I could, like you pointed out transform argparse objects to CommandArgs, but find it unnecessary (unless there is a good reason behind that I'm missing).

Additionally, type hints in pynteny.subcommands could be fixed using typing's union

What do you think?

  1. The defaults paths are quite inconvenient if I run the cli from the installed package rather than the source directory, the config file as well as the downloaded database ends up in /home//miniconda3/envs/Pynteny/lib/python3.10/site-packages/

This has been changed already in this PR.

  1. Since the author is using pathlib, they might use / or Path.joinpath instead of os.path.join

This was changed in this PR

  1. I have not profiled the code to see if there is a performance bottleneck there, but I would use itertuples instead of iterrows in addHMMmetaInfoToHits. itertuples is much faster if you need to iterate over data frames one row at a time

This was changed in this PR

(Will get rid of getattr in the next PR)

  1. My personal recommendation is to define the text encoding whenever a text file descriptor is opened. Even if this is not feasible at the moment, this is a tiny step toward Windows compatibility, and it removes a little bit of uncertainty.

Alright, will add that in the new PR.

  1. The MG1655.gb file is stored in the tests folder, which is not included in the package. Therefore, when the package is installed using conda, it is not downloaded.

Right, will modify this in the new PR.

Batalex commented 1 year ago

Thanks, @Robaina; I needed to make a second pass at my comment, but you have already taken care of it!

Aside from the remaining getattr to take care of, I am pleased to see that you found interest in my comments. While we wait for the second review, can I make some PR to add a few quality tools to ease the maintenance in the future?

I am thinking of:

Robaina commented 1 year ago

Sure @Batalex, go ahead with this(these) PR(s). I'll be glad to help. These changes should make the codebase very robust and pretty.

c-thoben commented 1 year ago

Here's my review! @Robaina @arianesasso

Thanks for onboarding me :) I am not that experienced, but I enjoyed looking into the review process. You did some really nice work @Robaina and I hope my comments are helpful.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme file requirements The package meets the readme requirements below:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.Package structure should follow general community best-practices. In general please consider whether:

Functionality

For packages also submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

*Estimated hours spent reviewing: 10 h


Review Comments

Robaina commented 1 year ago

Thank you so much for your review @c-thoben! I really appreciate the time you have taken to do this review.

Find below my answers to your comments:

  • When trying to install the software, I had some issues with anaconda resolving the package dependencies if i used the recommended command "conda install -c pynteny". I got some help from the community and with "conda create -n pynteny -c conda-forge -c bioconda python pynteny" the installation worked fine. To test Pynteny, I used a VM with Ubuntu 22.04.1 LTS. The anaconda version was 23.1.10 and python3 inside the pynteny environment 3.10.6. If you need furhter information I am happy to provide them.

The recommended way to install pynteny is via conda, in its own environment:

conda create -n pynteny
conda activate pynteny
conda install -c bioconda pynteny

as indicated in the README. It looks like this is what you did at the end. Python should be included in the environment by default.

I have to update the package in bioconda. There have been a lot of changes in pynteny's codebase since then, but did not want to bother the bioconda community with further updates until this review was over.

The new features can be tested installing the developers version, following instructions in the contributing file.

  • I have seen that if the PGAP database is downloaded, it is marked in the configuration file and the download is skipped at future exections. This can lead to problems if e.g. the user has deleted the database locally and wants to redownload it. Maybe a force option can be helpful.

I had the exact same thought. This feature is already included in the newer version of Pynteny (not yet in bioconda).

  • All tests passed for me. However, I have seen that in the scripts "test_database_build.py" and "test_integration_search.py", the unittest main is not called so the tests are not executed if the module is run locally. Is this right?

Right, I missed that. Thanks! It's been changed.

  • I enjoyed the quotes in the CLI :)

I'm glad you enjoyed the quotes :D

c-thoben commented 1 year ago

Hey @Robaina, thanks for your answer! It seems that most of my comments are already resolved. Onto the installation issue. I had the installation problems when I followed the recommended way in the README:

conda create -n pynteny
conda activate pynteny
conda install -c bioconda pynteny

The error I got is due to conda not being able to resolve the dependencies, I am also able to reproduce it. The user who helped me had the same incompatibility on fedora37 shipping python3.11 with conda environments with python 3.10, 3.9 and 3.8. She found out that the installation works if the dependencies are resolved by conda at the same time:

conda create -n pynteny
conda activate pynteny
conda install -c conda-forge -c bioconda python pynteny

If you need more informations I am happy to provide them!

Robaina commented 1 year ago

Hi @c-thoben,

thank you so much for spotting and clarifying this issue. I have added some lines to README to point out this issue: try your solution and also install with mamba instead of conda (which is usually much faster to resolve dependencies). This should be enough for now. However, I'm thinking of further restricting dependency versions in the next update on bioconda, this should also help reduce installation time / solving dependencies.

lwasser commented 1 year ago

@Robaina i'm using a pure mamba forge installation and this install is not working for me on a MAC in a clean environment. something is up with the dependencies that i think needs to be resolved in a way that users can understand.

@c-thoben had a difficult time installing on linux as well and her points need to be addressed!

can you explain a few things to us?

  1. conda install -c conda-forge -c bioconda python pynteny

why do you need to specify both conda-forge and bioconda channels when installing? are there some specific dependencies that are getting in the way?

  1. the instructions in the readme are confusing because there are now 3 installation options as follows: there should be a single way to install a package that works.
Pynteny can be install with the following command:

conda install -c bioconda pynteny
If conda takes a long time to solve the dependencies (e.g., more than a few minutes) or fails to solve it, you may try:

conda install -c conda-forge -c bioconda python pynteny
Or install it with [mamba](https://github.com/mamba-org/mamba) instead of conda:

mamba install -c bioconda pynteny

finally there is a personal channel in your environment here. . we want to ensure that authors are in fact using public channels. can please explain why a personal channel is listed in the environment?

Many thanks. It's critical that you address this installation issue as so many of us are running into confusion installing this package. While i understand the complexity of dealing with specific channels and complex installs, our goal here it so support usability in all of our vetted packages.

Robaina commented 1 year ago

Hi @lwasser,

thanks for your comments.

@Robaina i'm using a pure mamba forge installation and this install is not working for me on a MAC in a clean environment. something is up with the dependencies that i think needs to be resolved in a way that users can understand.

I agree. For what @c-thoben and you tell me, it seems like something is wrong with the package (now outdated) version in bioconda. Pynteny is built and installed without problems in the tests GitHub Action, both in Ubuntu and Mac.

can you explain a few things to us?

  1. conda install -c conda-forge -c bioconda python pynteny

why do you need to specify both conda-forge and bioconda channels when installing? are there some specific dependencies that are getting in the way?

bioconda channel needed for pynteny, conda-forge channel needed for python (solution suggested by @c-thoben). This was meant as a provisional fix until the new version was ready in bioconda. I have no trouble installing pynteny from bioconda in my Ubuntu 20.04 machine, so really could not know that this fix wouldn't work in other settings.

  1. the instructions in the readme are confusing because there are now 3 installation options as follows: there should be a single way to install a package that works.
Pynteny can be install with the following command:

conda install -c bioconda pynteny
If conda takes a long time to solve the dependencies (e.g., more than a few minutes) or fails to solve it, you may try:

conda install -c conda-forge -c bioconda python pynteny
Or install it with [mamba](https://github.com/mamba-org/mamba) instead of conda:

mamba install -c bioconda pynteny

I agree. Ideally only conda install -c bioconda pynteny, perhaps suggesting usage of mamba only to accelerate installation. Will update pynteny on bioconda and try installing it in the GitHub Action (Ubuntu and Mac) directly from bioconda once updated as a way to check installation on MacOS (I don't have access to a Mac machine)

finally there is a personal channel in your environment here. . we want to ensure that authors are in fact using public channels. can please explain why a personal channel is listed in the environment?

Thanks for spotting this. This channel was a remanent of a previous version in which pynteny was hosted in my personal conda channel. It was requested for me to upload pynteny to bioconda prior to review. Already removed.

Many thanks. It's critical that you address this installation issue as so many of us are running into confusion installing this package. While i understand the complexity of dealing with specific channels and complex installs, our goal here it so support usability in all of our vetted packages.

Of course. I totally agree. Thanks for raising these issues. I will update pynteny in bioconda, test installation again, and come back to you all.

Robaina commented 1 year ago

@lwasser

Sorry, I just realized. There is still one dependency, Streamlit-Aggrid, that is hosted in my conda channel. That's why my channel was still listed in the environment.

This is a non-critical dependency, it is employed to make an interactive results table in the web application. While installable from pip, I had to upload it to conda to be able to list it as a dependency in bioconda (pip dependencies not allowed).

Since this dependency is not hosted in a public channel, I will first try to contact Streamlit-Aggrid developer to suggest uploading it to bioconda, if unsuccessful, I will remove this dependency altogether and make a static results table in the web application.

lwasser commented 1 year ago

no worries @Robaina -Many thanks for your prompt reply. i know that this review has been a LOT of work and you've been incredibly committed throughout the process! I just want us to figure out the install items and we can do that together if need be! just to be sure i've pinged a colleague who works on conda-forge to help us sort this out as well. but it ALSO sounds like part of this issue are things you've fixed that haven't been released yet which is very understandable.

i have seen packages (such as geopandas) that did have a dependency that you had to install manually. but i think it's not ideal for users. perhaps there is some way for you to provide instructions for using the interactive approach (if you wish) if you can't find a way to have this on conda.

however, If the package is on pypi already, it's actually quite simple to move it to conda forge using greyskull. please let us know if you need resources to share with the authors. i think this is quite doable and might then be less work for you assuming the authors are ok with it. then every time they deploy to pypi they also merge a pr on conda forge a few hours later. Then it's in both places. i've done this with several packages and it's easy to maintain.

Robaina commented 1 year ago

Thank you for the support, @lwasser. I'm sure we will figure this out.

Alright, so I went to Streamlit-Aggrid's repo and found out that I had already contacted the authors in October 2022 (so many things going on, I totally forgot). Would be nice if we could set the conda-forge pipeline if the author agrees (contacted him again). I let you all know.

lwasser commented 1 year ago

ok awesome! yup, we will figure this out! i'll follow your issue referenced above as well. it looks like the maintainers have a lot of issues and not enough time to address them all which is very understandable and common in this space!

Robaina commented 1 year ago

Hi @lwasser,

I've been thinking about removing the Streamlit web application from pynteny and making it a separate pip-installable package pynteny-app which would contain this functionality. This package would work as an optional add-on to pynteny (which would not be submitted to PyOpenSci)

This change would solve the issue with Streamlit-Aggrid, and probably also help solve dependencies by conda. All the functionality would still be accessible via CLI and within Python as a module (which are the main ways to access pynteny, anyway)

Would this change be acceptable with respect to the status of the current submission?

Thanks!

lwasser commented 1 year ago

@Robaina i think so. but let's check in with @arianesasso and our reviewers @Batalex @c-thoben so they are aware of the update. i see that this dependency is really not core to your functionality so it could be a nice add on as you mention and would streamline your installation significantly! Ari, Alex and Corinna - does this seem reasonable? i don't want to take over this review i just want to help ensure things are installable. many thanks. i can stop complicatin gthe review once we have decided :)

lwasser commented 1 year ago

hey @Robaina it looks like that package may get added to conda-forge by a maintainer. i can keep you posted here if that will save you some time.

lwasser commented 1 year ago

it actually was already there - https://anaconda.org/conda-forge/streamlit-aggrid @Robaina can you confirm that this is what you need? if so it should just be a simple environment update to make everything work.

Robaina commented 1 year ago

Yes, awesome, thank you @lwasser !

Robaina commented 1 year ago

Hi @arianesasso, @Batalex, @c-thoben and @lwasser,

after testing the new packages Streamlit and Streamlit-Aggrid in conda-forge, I am now facing a new issue with Streamlit, basically this one.

While I like the Streamlit app, I would like to prioritize other issues regarding core functionality of pynteny (e.g. adding support for the PFAM database). Therefore, I am now considering the idea previously commented of removing the web interface from pynteny and instead create a new addon pip package containing it.

Please, confirm that this would be an acceptable change to this submission. Here is the branch without the app.

Thanks for all the help :)

Batalex commented 1 year ago

On the one hand, I think that the streamlit app is orthogonal to the package itself. I mean that removing it does not change the package's "elevator pitch." On the other hand, the web app is mentioned in the paper.md (https://github.com/Robaina/Pynteny/blob/main/ms/paper.md) submitted to JOSS.

I would be +1 on moving it to another package, if it means that a larger pool of users can use the core package, but I wonder if it could be an issue with JOSS.

lwasser commented 1 year ago

Thanks @Batalex or maybe just removing it from the JOSS paper or mentioning it as a plugin / add on? it sounds like this won't impact our review at all - i think we can just retest installation once all of this is added on our end. And then on the joss end you can always edit the paper to clarify that component.

Robaina commented 1 year ago

Great, I agree. I will edit the paper. I should be fine since it is not even submitted yet. I'll let you know once I test installation after these new changes.

Robaina commented 1 year ago

Hi there @arianesasso, @c-thoben @Batalex ,

I have updated pynteny on bioconda, and tested installation directly from bioconda within a GitHub Action (under ubuntu-lastest and macos-latest).

I also updated setup instructions in the README, and modified python version, since now pynteny depends on python 3.10.

Please, could you verify that you can install pynteny in your settings? (perhaps @lwasser could also try installing it in MacOS?)

Thanks!

lwasser commented 1 year ago

sure thing i'll try to install on macOS @c-thoben can you please try to install again too following the new readme instructions? @Robaina thank you for your dedication and promptness in getting all of these things sorted out and fixed! it's much appreciated.

Batalex commented 1 year ago

I am not able to install on MacOS (M2)

> mamba create -n pynteny -c bioconda -c conda-forge python=3.10 pynteny

Looking for: ['python=3.10', 'pynteny']

bioconda/osx-arm64                                 129.0 B @ 296.0 B/s  0.4s
bioconda/noarch                                    @   2.3MB/s  2.0s
conda-forge/noarch                                 @   4.2MB/s  3.2s
conda-forge/osx-arm64                              @   1.7MB/s  3.5s
Could not solve for environment specs
Encountered problems while solving:
  - nothing provides requested pynteny

The environment can't be solved, aborting the operation
lwasser commented 1 year ago

i can confirm the same issue and response on my mac!

whats odd is i think the error is it can't find pynteny on bioconda :

PackagesNotFoundError: The following packages are not available from current channels:

  - pynteny

but it is there!

@Robaina do you need help with this? i think we can get you some help. can i invite you to our slack? we can talk there? if so please email me at leah @ pyopensci.org and i'll send you an invite so we can try to figure this out together.

Robaina commented 1 year ago

Sure thing @lwasser, will send you an email for that.

I think the issue might be related to the "new" ARM64 processor architecture in MAC. The machine in which @Batalex tried to install pynteny seems to have one (and all the new apple machines starting November 2020, I think).

Several dependencies of pynteny don't provide support for ARM64, e.g. hmmer, prodigal, pyfastx. Thus, pynteny itself doesn't have support for ARM64 architectures as of now. (To compare, here is another dependency psutil which does have ARM64 support on bioconda).

If that's the case, there is a workaround, basically telling conda to construct an environment using x86 architecture on ARM64 machines to install dependencies not yet supporting ARM64. Here is a blog post with instructions, for what I read, something like:

CONDA_SUBDIR=osx-64 conda create -n pynteny_x86 python=3.10
conda activate pynteny_x86
conda config --env --set subdir osx-64

and then, I guess:

(pynteny_x86) conda install -c bioconda pynteny

That's something new I learned today... I have no way to test if this workaround will be sufficient, though. Will send you the email now, anyway.

Thanks!

arianesasso commented 1 year ago

@Robaina Thank you for going the extra mile and making all those changes!

Batalex commented 1 year ago

The joys of novelty 😬

c-thoben commented 1 year ago

@Robaina I am able to install Pynteny with the README instructions on my Linux VM and also on MacOS without problems. As I have a 2018 MacBook, this may support the ARM64 processor theory. Thanks for looking into this issue @Robaina and thanks for your support @lwasser!

Robaina commented 1 year ago

Awesome! then perhaps I could add that workaround solution to the setup instructions for the people working on ARM64 machines.

lwasser commented 1 year ago

Ok i just installed in my M1 - the sequence below worked!

CONDA_SUBDIR=osx-64 conda create -n pynteny_x86 python=3.10
conda activate pynteny_x86
conda config --env --set subdir osx-64
conda install -c bioconda pynteny

it did take it a moment to import into python but it did import successfully NOTE: i talked with a conda forge maintainer about this - IF those three packages on bioconda could be also added to conda-forge then they can actually be added to a list where a bot tries to build the appropriate arch wheels (arm64 for osx in this case). of course there might be some reason that they aren't already there so this could be a non trivial task. but i think for our purposes if the package installs on everyone's computer who is here in this review, i think we'd done due diligence. and all that is needed is a documentation update for how to install.

Robaina commented 1 year ago

Excellent. Thanks for consulting this with conda-forge @lwasser . I have added these extra lines to README and also the documentation. This issue can always be updated in future releases.

Thanks so much everyone for the review!