mtopaz / NimbleMiner

NimbleMiner: software that lets users interact with word embeddings to rapidly create lexicons of similar terms, conduct weakly supervised labeling, and implement text mining
GNU General Public License v2.0

RSConnect publishing issues #16

Open rdewald opened 3 years ago

rdewald commented 3 years ago

Three of your dependencies are not on CRAN: rword2vec, wordVectors, and NimbleMiner, the package that the installation routine builds.

After getting the application to work in RStudio Pro, I tried to publish to RStudio Connect. Most of it went okay until it came to the installation of these three packages:

[Connect] 2020/11/10 19:38:59.677275289 Installing rword2vec (1.1) ... 
[Connect] 2020/11/10 19:39:00.803693140 curl: (22) The requested URL returned error: 404 
[Connect] 2020/11/10 19:39:00.803710952 curl: HTTP 404 https://cloud.r-project.org/src/contrib/Archive/rword2vec/rword2vec_1.1.tar.gz
[Connect] 2020/11/10 19:39:02.290200350 curl: (22) The requested URL returned error: 404 
[Connect] 2020/11/10 19:39:02.290220197 curl: HTTP 404 https://cloud.r-project.org/src/contrib/Archive/rword2vec/rword2vec_1.1.tar.gz
[Connect] 2020/11/10 19:39:03.345827923 curl: (22) The requested URL returned error: 404 
[Connect] 2020/11/10 19:39:03.345843617 curl: HTTP 404 https://cloud.r-project.org/src/contrib/Archive/rword2vec/rword2vec_1.1.tar.gz
[Connect] 2020/11/10 19:39:04.600536423 curl: (22) The requested URL returned error: 404 
[Connect] 2020/11/10 19:39:04.600553794 curl: HTTP 404 https://cloud.r-project.org/src/contrib/Archive/rword2vec/rword2vec_1.1.tar.gz
[Connect] 2020/11/10 19:39:05.653703616 curl: (22) The requested URL returned error: 404 
[Connect] 2020/11/10 19:39:05.653721657 curl: HTTP 404 https://cloud.r-project.org/src/contrib/Archive/rword2vec/rword2vec_1.1.tar.gz
[Connect] 2020/11/10 19:39:06.657899821 FAILED
[Connect] 2020/11/10 19:39:06.659403225 Error in getSourceForPkgRecord(pkgRecord, srcDir(project), availablePackagesSource(repos = repos), : Failed to retrieve package sources for rword2vec 1.1 from CRAN (internet connectivity issue?)
[Connect] 2020/11/10 19:39:06.659443313 
[Connect] 2020/11/10 19:39:06.659495861 Unable to fully restore the R packages associated with this deployment.
[Connect] 2020/11/10 19:39:06.659501123 Please review the preceding messages to determine which package
[Connect] 2020/11/10 19:39:06.659537844 encountered installation difficulty and the cause of the failure.
[Connect] 2020/11/10 19:39:06.659725271 Error code: r-package-not-available

[Connect] An R package required by the content cannot be found in the package repository.

[Connect] Possible causes:
[Connect] * The R package being installed is not available for the version of R configured at Connect.
[Connect] * The R package being installed is not available for linux.
[Connect] * The client computer that published the content is using a different package repository from the Connect server, and the R package being installed is not available in the repository configured at the Connect server.
[Connect] * The package repository moved to a new URL after the content was published, and Connect is now attempting to rebuild the environment using the old package repository URL.
[Connect] 
[Connect] Possible solutions:
[Connect] * Install a version of R on the Connect server that matches the one being used by the client computer. You can identify the required R version from the deployment logs, which will contain an entry similar to this:
  > Bundle requested R version 3.5.0; using /usr/lib/R/bin/R which has version 3.4.4
  In this example, the client computer has R 3.5.0 and the Connect server has R version 3.4.4. The recommended solution would be to install R version 3.5.0 on the Connect server alonside the existing 3.4.4 installation.
[Connect] * Replace usage of Windows-specific R packages with ones available for linux.
[Connect] * Configure the client with a package repository that is accessible via http(s) from the Connect server, such as CRAN or RStudio Package Manager. Reinstall the affected packages from the new repository and publish the content again.
[Connect] 
[Connect] References:
[Connect] * https://docs.rstudio.com/connect/admin/getting-started/#installation
[Connect] * https://support.rstudio.com/hc/en-us/articles/360004067074-Managing-Packages-with-RStudio
[Connect] 
[Connect] 2020/11/10 19:39:06.659725271 Additional data:
[Connect] Repository: 'CRAN'
[Connect] Package: 'rword2vec'
[Connect] PackageVersion: '1.1'
[Connect] Found in the following log entry:
[Connect] 2020/11/10 19:39:06.659403225 Error in getSourceForPkgRecord(pkgRecord, srcDir(project), availablePackagesSource(repos = repos), : Failed to retrieve package sources for rword2vec 1.1 from CRAN (internet connectivity issue?)
[Connect] Build error: An R package required by the content cannot be found in the package repository. (r-package-not-available)
Document deployment failed with error: An R package required by the content cannot be found in the package repository. (r-package-not-available)
Warning messages:
1: The vignette title specified in \VignetteIndexEntry{} is different from the title in the YAML metadata. The former is "Vignette Title", and the latter is "Word2Vec Workshop". If that is intentional, you may set options(rmarkdown.html_vignette.check_title = FALSE) to suppress this check. 
2: The vignette title specified in \VignetteIndexEntry{} is different from the title in the YAML metadata. The former is "Vignette Title", and the latter is "Word2Vec introduction". If that is intentional, you may set options(rmarkdown.html_vignette.check_title = FALSE) to suppress this check. 
3: In FUN(X[[i]], ...) :
  Package 'NimbleMiner 0.1.0' was installed from sources; Packrat will assume this package is available from a CRAN-like repository during future restores
4: In FUN(X[[i]], ...) :
  Package 'rword2vec 1.1' was installed from sources; Packrat will assume this package is available from a CRAN-like repository during future restores
5: In FUN(X[[i]], ...) :
  Package 'wordVectors 2.0' was installed from sources; Packrat will assume this package is available from a CRAN-like repository during future restores

We have RStudio Package Manager available, so I will be investigating a fix by setting up a local repo that is accessible to RSConnect. I thought you might want to know that this will apparently need to be addressed if you want to publish this app to RSConnect.

I will update when I make progress.

mtopaz commented 3 years ago

Perfect, thanks so much for fixing this, Richard! Let me know if I can be of any help. Also, let me know if you get stuck with any specific packages. I can help find them and share them with you.

rdewald commented 3 years ago

If we fix this by adding a local repo it would be helpful to hack up a VNSNY-specific version (we have it forked) of the installation R script so that it fetches the packages from our local repo instead of building them from devtools and github.
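A rough sketch of what such a trimmed installer could look like, assuming the three packages have been published to a CRAN-like repository. The helper name `install_missing` and the `https://example.org/r-pkgs/` URL are placeholders for illustration, not part of NimbleMiner or of our actual setup:

```r
# Sketch only: install_missing() is a hypothetical helper and the "local"
# repository URL is a placeholder, not a real server.
install_missing <- function(pkgs, repos, dry_run = FALSE) {
  # Packages already present in any library on .libPaths()
  have <- rownames(installed.packages())
  todo <- setdiff(pkgs, have)
  if (length(todo) > 0 && !dry_run) {
    # Pull from the listed repositories instead of building from GitHub
    install.packages(todo, repos = repos)
  }
  invisible(todo)
}

repos <- c(
  CRAN  = "https://cloud.r-project.org/",
  local = "https://example.org/r-pkgs/"  # hypothetical internal repo
)

# dry_run = TRUE only reports what would be installed
to_install <- install_missing(
  c("rword2vec", "wordVectors", "NimbleMiner"),
  repos,
  dry_run = TRUE
)
print(to_install)
```

With `dry_run = FALSE` the same call would install whatever is missing from the listed repositories, which only works once the packages actually exist there.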

mtopaz commented 3 years ago

Sounds like a good solution to me Richard!

rdewald commented 3 years ago

To separate the RS Connect issue from #17, I went back to another server and verified that the app loads and runs as expected in R 3.6.3, which is where I originally deployed it and what we were using during the demo to the VNSNY BIA group.

However, when I try to publish, I am met with this error, which, by the way, is not related to the problems you are having in https://github.com/vnsny-bia/server-datasci.vnsny.org/issues/35. If you weren't having that problem, you would have had this one next:

This is what gets deployed: the README from one of the GitHub dependencies. You'll need to be behind the VPN to see that. Since this is a public repo, allow me:

image

Doh!

The problem is that rsconnect is looking for a deployable document. I think you might need to make an RMarkdown version of your app.

It works just fine in RStudio Pro. I will be communicating in a separate channel about the way forward with the research project.

rdewald commented 3 years ago

I have addressed the issues with package management for the shiny app with vnsny-bia/nimbleMine#1

rword2vec and wordVectors are loaded using

devtools::install_github("mukul13/rword2vec")
devtools::install_github("bmschmidt/wordVectors")

The functions that used to be loaded into the local build of the NimbleMiner package now are available in a local repo on datasci. This repo is added via:

options(repos = c("https://cloud.r-project.org/", "https://datasci.vnsny.org/r-pkgs/"))

Because of this success, our fork vnsny-bia/NimbleMiner has had the installer trimmed.
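For reference, a variant of the `options(repos = ...)` call above with named entries. Naming repositories is a common convention in R, and the labels `CRAN` and `VNSNY` here are my own illustrative choices. Whether the names change how RStudio Connect resolves packages in this deployment is an assumption I have not tested:

```r
# Named variant of the repos option. The names "CRAN" and "VNSNY" are
# illustrative labels chosen for this sketch.
repos <- c(
  CRAN  = "https://cloud.r-project.org/",
  VNSNY = "https://datasci.vnsny.org/r-pkgs/"
)

# options() returns the previous values, so the change can be undone later
old <- options(repos = repos)
print(getOption("repos"))

# To restore the earlier setting:
# options(old)
```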

@mtopaz I haven't sent a PR because when I deploy in RStudio-connect, this error is generated:

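[image: nimbleMiner-RSConnect] https://user-images.githubusercontent.com/1530216/105923358-434d2200-600a-11eb-8bc8-388ce9f8215a.jpg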

Any ideas?

mtopaz commented 3 years ago

Thanks Richard! It looks to me like the issue is with a package called "NLP" and some issues with tensorflow. Would installing these packages help? https://cran.r-project.org/web/packages/tensorflow/index.html and https://cran.r-project.org/web/packages/NLP/index.html. I don't believe that NimbleMiner uses either of these packages, so it is a bit weird.


--

Maxim (Max) Topaz PhD, RN, MA Elizabeth Standish Gill Associate Professor of Nursing

Columbia University Medical Center

Columbia University Data Science Institute

Visiting Nurse Service of New York

Harvard Medical School & Brigham and Women's Hospital

Website: http://nursing.columbia.edu/profile/mtopaz

rdewald commented 3 years ago

> Thanks Richard! It looks to me like the issue is with a package called "NLP" and some issues with tensorflow. Would installing these packages help? https://cran.r-project.org/web/packages/tensorflow/index.html and https://cran.r-project.org/web/packages/NLP/index.html I don't believe that NimbleMiner uses either of these packages so it is a bit weird.

Right, I think the issue is related to a backend package called reticulate that mediates the integration with Python, which is likely a dependency in almost any machine learning environment. Deeper than that, the issue is probably with our Python install on that server, which has its own issues.

We are working on a cloud instance of RStudio Connect. Once we start testing that, I'll publish NimbleMiner to that location and report back with what I learn. In the meantime, your researchers have something to use anyway.
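A small diagnostic along these lines might help confirm the reticulate/Python theory on a given server. This is only a sketch: `python_status` is a name made up here, and the script assumes nothing about how Python was installed. It asks reticulate which interpreter it would pick without initializing Python:

```r
# Diagnostic sketch: python_status() is a hypothetical helper, not part of
# reticulate or NimbleMiner.
python_status <- function() {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return("reticulate is not installed")
  }
  # py_discover_config() looks for a Python interpreter without starting it;
  # it returns NULL when none is found.
  cfg <- tryCatch(reticulate::py_discover_config(), error = function(e) NULL)
  if (is.null(cfg)) {
    return("reticulate could not find a usable Python")
  }
  paste("reticulate would use:", cfg$python)
}

status <- python_status()
message(status)
```

Running this both in the RStudio session and in the environment the deployed content runs in would show whether the two see the same Python.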

mtopaz commented 3 years ago

Yes! Excellent- thanks
