statdivlab / corncob

Count Regression for Correlated Observations with the Beta-binomial
102 stars 22 forks source link

Get! Corncob! back! on! CRAN! (Also, installation workaround code) #157

Closed natalie0lson closed 8 months ago

natalie0lson commented 10 months ago

Wanted to share how I was able to pull corncob (and deprecated dependencies including optimr) from archived files to save anyone else the trouble (I'm sure there are more efficient ways to do this but this worked for me.)

install.packages(c("optextras", "setRNG"))

install.packages("mypath/Rcgmin_2022-4.30.tar.gz", repos = NULL, type="source")

install.packages("mypath/Rvmmin_2018-4.17.tar.gz", repos = NULL, type="source")

install.packages("mypath/optimr_2019-12.16.tar.gz", repos = NULL, type="source")

install.packages("mypath/corncob_0.3.2.tar.gz", repos = NULL, type="source")

adw96 commented 10 months ago

Thanks so much for this, @natalie0lson !

Were you trying to access an old version of corncob? If so, we could make the versions available as tarballs to help other folks. Let us know if this would have been helpful.

Basically, I'd love to know what you were trying to do (and why) so we can make this easier for future users.

natalie0lson commented 10 months ago

Hi Amy! It looks like the package "optimr" was archived (in favor of "optimx" which is now available via CRAN) which caused corncob to also be archived as of 10-29-2023 (only a few days ago). Was able to easily access the archived tarballs for corncob but due to its dependency on "optimr" (which also seems to depend on archived packages "Rcgmin" and "Rvmmin") the process got a little complicated so I figured I'd save anyone else the headache.

But if you've got time on your hands to switch the corncob dependency on optimr to optimx I think that might solve the problem.

adw96 commented 10 months ago

Thanks so much for letting us know, @natalie0lson ! This isn't the first time CRAN has removed corncob without telling us 😿

@svteichman Can I please ask your help with investigating optimx and transitioning our code if possible? 🙏

bryandmartin commented 10 months ago

FWIW - optimr is/was only used in the package as an alternative to trust region optimization in bbdml if it was numerically unstable for some reason (default is still trust). The only place it appears is line 251 of bbdml.R (and in the corresponding documentation) for optimization. I haven't checked the syntax for optimx, but my guess is the single call to optimr::optimr can be pretty easily transitioned.

svteichman commented 10 months ago

I'll look into this. Thanks @bryandmartin for the tip!

adw96 commented 10 months ago

Agreed, thanks so much @bryandmartin . Any thoughts on if it should it be a "Suggests" not an "Imports"? (to reduce our reliance on changing dependencies)

bryandmartin commented 10 months ago

That makes sense to me @adw96. It would prevent something like this in the future. The only con I can think of is if someone wanted to use optim optimization and didn't already have the package, I think it would throw an error.

svteichman commented 10 months ago

From the vignettes of optimr and optimx it seems that originally optimr was going to be the version on cran that was relatively stable and optimx was going to be a version of the same package available on alternative package repositories and would allow more optimization methods (and the author expected that it would occasionally glitch due to that). However, it seems that optimr was recently removed from cran and optimx was either updated or added around the same time, so I suspect that optimx is now the stable version of the package.

I could put optimx as a Suggests and then in the code if the user chooses the optim optimization have it check to see if optimx is loaded and if not throw an error that says that if they want to use optim they need to install and load that library.

svteichman commented 10 months ago

For context, optimr and optimx both have the function optimr() being used in bbdml() and give the same results in the examples I've run (although with slightly different return item names).

adw96 commented 10 months ago

This just occurred to me -- in the setting where a current user has optimr installed (but not optimx), is it possible for bbdml to call the installed package instead? (It would be great to avoid errors for people who had it working previously! )

svteichman commented 10 months ago

I believe so, I'll have bbdml() use either optimx or optimr if either are installed, and if not throw an error suggesting that the user installs optimx.

adw96 commented 10 months ago

Just edited to remove the requireNamespace(optimr) and associated "no visible global function definition for ‘optimr’" NOTEs and use optimr::optimr instead -- see commit d02ac1c

@svteichman could you also please talk me through renaming this to main today? 🙏

adw96 commented 10 months ago

now triggering a warning, sigh 😿 I'm not sure what the solution is. We do need a binding for optimr or we'll get NOTEs (preventing CRAN submission)... Can you think of a way to to get rid of both warnings and notes, @svteichman ? 🙏

svteichman commented 10 months ago

I can talk you through it in our meeting - I don't think so, so I went with the note over the warning, but we may want to remove the use of optimr to avoid all notes and warnings. I'll keep thinking about a different workaround though.

On Tue, Nov 7, 2023 at 9:25 AM AmyW @.***> wrote:

now triggering a warning, sigh 😿 I'm not sure what the solution is. We do need a binding for optimr or we'll get NOTEs (preventing CRAN submission)... Can you think of a way to to get rid of both warnings and notes, @svteichman https://urldefense.com/v3/__https://github.com/svteichman__;!!K-Hz7m0Vt54!kG-4B8RSlfkcqvEvBn_PMYETEN59B9PXoQ-EROB8oCCJJqu2q0JJv3R85piCCc0yVYI_DHIXryPaDizPkgvKeQ$ ? 🙏

— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/statdivlab/corncob/issues/157*issuecomment-1799286512__;Iw!!K-Hz7m0Vt54!kG-4B8RSlfkcqvEvBn_PMYETEN59B9PXoQ-EROB8oCCJJqu2q0JJv3R85piCCc0yVYI_DHIXryPaDiwFvZ0l0Q$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AEDVQ7KW5F4NCHBGIG7OSWTYDJVH3AVCNFSM6AAAAAA6ZVUCFOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOJZGI4DMNJRGI__;!!K-Hz7m0Vt54!kG-4B8RSlfkcqvEvBn_PMYETEN59B9PXoQ-EROB8oCCJJqu2q0JJv3R85piCCc0yVYI_DHIXryPaDiycyaAARQ$ . You are receiving this because you were mentioned.Message ID: @.***>

svteichman commented 10 months ago

The problem that I see is that as soon as we add optimr::optimr() then we get a warning because optimr is not declared in the DESCRIPTION under imports or suggests, but then when I try adding it to suggests I get an error because it is not currently available.

adw96 commented 10 months ago

@bryandmartin Did you ever come across situations where trust optimization failed and optimr converged? We are thinking of deprecating the optimx/optimr options (for ease of maintenance + trust works well almost always!) and would love your perspective.

Another option is to use stats::optim -- do you have thoughts on this?

bryandmartin commented 10 months ago

Honestly, I don't remember the answer to whether I had fails in one but not the other. I think optimr can be safely deprecated, and optimx probably can too. However, I also think you could solve the optimr issue by changing optimr::optimr to optimx::optimr. The warning comes because you are searching through a namespace that the package doesn't know about. You might be able to fix both by just changing any instance of optimr::[fn] to optimx::[fn].

In any case, I think deprecating both would come at no major cost, besides giving users one less option.

adw96 commented 9 months ago

Update -- the amazing @svteichman has resubmitted us to CRAN! 🥳 Thanks, Sarah!

svteichman commented 9 months ago

It turns out that a major barrier to resubmitting to CRAN is that the phyloseq package is not available for r-release-macos-arm64 and r-release-macos-x86_64, both of which are platforms checked by CRAN for package submission. Therefore, we would need an alternative way to install phyloseq on these platforms or to switch phyloseq from an "Imports" to a "Suggests" in the package. This would require a large overhaul of the codebase, because most of the functions expect that data is either given as a phyloseq object, or as a data frame that is then internally transformed into a phyloseq object.

For now, we will not attempt to resubmit corncob to CRAN and will instead have it available for download from github using remotes::install_github("statdivlab/corncob"). If anyone has a strong desire for corncob to be on CRAN, feel free to let us know and reopen this issue.

bryandmartin commented 9 months ago

For potential future reference: bbdml actually converts phyloseq objects into data frames for analysis. differentialTest does convert data frames and matrices into phyloseq objects, but then it converts the data back to a data frame object within the for loop for the actual analysis. This was probably an unnecessary double conversion, mostly done because it made it easy to subset data by taxon in a consistent way (Line 128). Lines 72-113 of differentialTest.R and the corresponding section of the contrasts function are the major barriers here. For functional purposes all that chunk of code is really doing is finding the number of parameters being estimated for both DA/DV testing, and recording taxon names if they are there.

There are also a few functions written specifically for cleaning up phyloseq objects in some way (convert_phylo, clean_taxa_names, etc). These would actually be fairly easy to convert because you would only need to add a check for the library at the beginning of the function and just stop with error explaining how to download it if it's not already there. You could even do that for differentialTest if you wanted to avoid any extensive overhaul.

Maybe it's a moot issue if you aren't planning on putting it back on CRAN, but that's my two cents if you do decide to change this later.

xec-cm commented 9 months ago

Hi @svteichman

I’ve noticed a comment regarding the challenge of resubmitting to CRAN due to the unavailability of the phyloseq package for r-release-macos-arm64 and r-release-macos-x86_64. I’ve found that by forking and modifying the GitHub action with usethis::use_github_action(), I’ve been able to successfully install phyloseq from Bioconductor on macOS you can see the successful run at https://github.com/xec-cm/corncob/actions/runs/6966602478/job/18957030718).

image

I hope this addresses the issue, as I’m currently in review with Bioconductor for a package that partially depends on yours, and not having it on CRAN could pose a problem for me and all other packages that depend on yours in Bioconductor.

Another option could be to migrate from phyloseq to TreeSummarizedExperiment using the mia package. If you’re interested in this approach, I’d be more than happy to lend a hand. Looking forward to hearing your thoughts.

svteichman commented 9 months ago

Hi @xec-cm,

Thanks for letting us know about this! I'll check out your fork of the Github actions file and we'll try our best to figure out this phyloseq dependency to get corncob back on CRAN.

Sarah

adw96 commented 9 months ago

Wow, thank you so much @xec-cm! This is hugely helpful as we navigate this challenge. Many thanks for your comment.

svteichman commented 8 months ago

Closing this issue because corncob is back on CRAN! Woohoo!

@bryandmartin Thanks for your ideas about reducing the reliance on phyloseq, it is now a Suggests instead of an Imports.

@xec-cm Tagging you to let you know. Thanks for your comments and ideas on this issue!

adw96 commented 8 months ago

A huge shout out to @svteichman for creative solutions to make this happen. A lot of work went on behind the scenes to make this happen. Well done and thank you, Sarah!