Closed ablack3 closed 6 months ago
The size is too big (> 5MB). I'd like it to fit into cran, but there was a feature that was thrown into the JSON serialization that calculates the CDM compatability using a javascript library. I tried to resist this feature but it was pushed in, but I'd like to remove it (I haven't done analysis on the impact). Because the semver compatibility logic is a javascript module, we bundle a javascript execution engine with circe (actually with StandardAnalysisUtils, which circe depends on) which turns out to be 47MB (which is 42MB too many MB and that's just a dependency).
It would be so nice to have a minified version of Circe available on CRAN since it is a core part of the OHDSI stack. Would it be a bad idea to copy the java source code to CirceR and create a minified version with a smaller feature set - primarily transforming circe json to OHDSI SQL.
I think cran requires that the java source code is in the package.
@ablack3 @edward-burn
We have managed to reduce the size of circe-be to approx 4.5MB and Chris has recently created a new release, so principally this obstacle should be removed from the 'road to cran' for CirceR.
I'm reviving this issues so you guys can get involved with Chris in whatever is necessary to make this happen :-)
Awesome @RowanErasmus!
Ok, I've published a release that has the changes to reduce the package size. I'm not sure where to begin to try to get this into cran (or even if the changes have made the overall pacakage size < 5MB for cran).
If someone wants to take the ball and see about how that could be set up, that would be most helpful.
Well Cran packages that contain compiled java code need to also contain the java source code as far as I know. So that is a start. Would you like me to try and submit it to CRAN? Basically it involves
devtools::submit_cran()
Ok, sounds good, just to see what issues they are going to have. I don't think there's a ton of source code (that's an odd requirement) but we're so close to the 5MB mark with just the dependencies alone, I worry they are going to trip that restriction. But it's worth a shot. Thanks, @ablack3 .
Be sure to fix these notes before submitting to CRAN:
Found if() conditions comparing class() to string:
File ‘CirceR/R/ConceptSetSqlBuilder.R’: if (class(conceptSetJSON) == "character" && length(conceptSetJSON) == 1 && nchar(conceptSetJSON) > 0) ...
File ‘CirceR/R/PrintFriendly.R’: if (class(conceptSetList[[1]]) == "jobjRef") ...
Use inherits() (or maybe is()) instead.
and add a cran-comments.md
file in the root (see here for an example)
I use this for submitting to CRAN. Not sure what the difference with devtools::submit_cran()
is:
devtools::check_win_devel()
devtools::check_rhub()
devtools::release()
Just remembered: please change the system requirements in DESCRIPTION to:
SystemRequirements: Java (>= 8)
(Almost got kicked out of CRAN for that one)
Thank you for the heads up @schuemie!
Hi! Is there any update regarding CirceR on CRAN? Thanks :)
Ok, I'll get on this...I think recent updates to circe has reduced the size (thanks @ablack3 ) and it might be small enough now.
So, I have my doubts about this getting done. I ran through various checks locally and I get this with the internal (RStudio) Check:
* using log directory 'C:/Git/CirceR.Rcheck'
* using R version 4.1.2 (2021-11-01)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* checking for file 'CirceR/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'CirceR' version '1.3.3'
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package 'CirceR' can be installed ... ERROR
Installation failed.
See 'C:/Git/CirceR.Rcheck/00install.out' for details.
* DONE
Status: 1 ERROR
The log file gives no additional details. The build works fine, tho, and installing package from remotes::install_github()
works fine.
Why do we want this to go into CRAN, again?
Ah, found additional log files:
* installing *source* package 'CirceR' ...
** using non-staged installation
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
*** arch - i386
Error: package or namespace load failed for 'CirceR':
.onLoad failed in loadNamespace() for 'CirceR', details:
call: NULL
error: .onLoad failed in loadNamespace() for 'rJava', details:
call: fun(libname, pkgname)
error: No CurrentVersion entry in Software/JavaSoft registry! Try re-installing Java and make sure R and Java have matching architectures.
Error: loading failed
Execution halted
*** arch - x64
ERROR: loading failed for 'i386'
* removing 'C:/Git/CirceR.Rcheck/CirceR'
So, I don't (and won't) have the JDK for i386, is there a way to indicate that this package isn't for i386? or instead declare it 'multi-arch'? I notice I usually have to pass --no-multi-arch
in my install commands like:
remotes::install_github('ohdsi/DatabaseConnector', INSTALL_opts=c("--no-multiarch"))
In R-Studio, you can set the R check options (Build --> More --> Configure build tools ...), where I believe you can add --no-multiarch to the Check options.
But perhaps you can re-install R without the 32-bit version? R 4.2.3 (the HADES official supported version) does not include the 32-bit version by default.
@chrisknoll Answering your question from above:
Why do we want this to go into CRAN, again?
We have some collaborators working in a Secure Data Environment who unfortunately cannot install packages from GitHub, only from CRAN, into their environments.
But perhaps you can re-install R without the 32-bit version? R 4.2.3 (the HADES official supported version) does not include the 32-bit version by default.
Now there's a reason for going to that version hahah. Ok I'll go with that.
Ok, #43 is created, waiting for GitActions to show everything passes.
Once that's done, I believe the steps will be to create a new release into master, and then I'd submit to cran from master branch.
That's fantastic!! :D
Looks like we're all green on the check side. I also addressed an issue raised by git actions that Node 16 is deprecated and we should be using the Node 20 based actions, so you can change your actions/checkout@v2
to actions/checkout@v4
and actions/cache@v2
to actions/cache@v4
.
Since I believe these action scripts can be shared by hosting in a repo (example: https://github.com/actions/cache/blob/v3/action.yml contains the script of the cache action for v3), maybe there's something we can do for HADES based actions like RCMD checks so that if we want to update a version of another action, we don't have to go through all of our repos to update the gitaction scripts.
Great work!
I would love it if we can share the same GitHub Action scripts across HADES. If anyone wants to explore that option it would be greatly appreciated.
@schuemie , it looks like you started researching this (I was going to create an 'actions' repo but it already existed.
If you don't object, I can migrate the RCMDCheck workflow into the shared repo (you started it, but it's 2 years old now), and I think they suggest a specific directory to host workflows (even shared ones)....which is .github/workflows/{workflowfile}.yaml
, but honestly, i don't see why we wouldn't host it in a YAML right in the root as you did here.
This is the guide I'd follow to make it reusable: https://resources.github.com/learn/pathways/automation/intermediate/create-reusable-workflows-in-github-actions/
Update: As of 4/3/2024 link is broken, hopefully it's just server issue.
Yes, I should have mentioned that I started working on that, but then got stuck. If I remember correctly, the problem I didn't know how to solve was how to include the secrets / environmental variables in the shared script.
Please feel free to change everything in that repo.
Just circling back, i'm going to focus first on cran submission and then circle back to make workflows reusable.
This has been accepted in CRAN. Thanks to everyone for the assistance.
@chrisknoll, Is there any hope of getting CirceR on CRAN?
tagging @edward-burn