Closed Laurae2 closed 3 years ago
@Laurae2
@guolinke CRAN Windows machines do not have MSBuild or Runtimes for Visual Studio compiled code. They use MinGW 4.9 and Rtools to compile code and create packages.
If using precompiled stuff, we have to check with them about how we can get the package accepted on CRAN. CRAN usually compiles packages themselves in an automated fashion and there should be no error/warning/note when running the check on all platforms they have (unless we have a special reason for the check). The compiled package by CRAN is what is proposed to R users as direct download. As all CRAN / R users (who compile stuff like for Rcpp) have Rtools, they forcibly have MinGW 4.9 hence why the requirement to switch to MinGW as default: VS compiled code will not run nor load on CRAN machines (DLL error), leading to the package being rejected.
@Laurae2 It can run by copy vc runtime DLL to the same folder of lib_lightgbm.dll manually
@guolinke vc runtime DLL is MIT license? Everything in the R package must be MIT license if the R package is MIT in CRAN. There are no exception for this rule in CRAN.
This is the rule for using binary packages for Windows / Mac:
Binary packages
Policies for when a (Windows or OS X) binary package will be distributed:
- all its package dependencies on CRAN are available for that platform. Dependencies from other repositories will be installed at CRAN’s discretion.
- any external software needed can easily be installed on the build machine for all the sub-architectures: here “easily” includes not depending on specific versions, nor should the installed binary depend on specific versions.
- it passes R CMD check without error for all the available sub-architectures, or at CRAN’s discretion, for the most important sub-architecture(s).
Binary packages are not accepted from maintainers: CRAN will only host binary packages prepared by those responsible for the binary areas. Their packages are made automatically by batch jobs and can take a day or two to appear on the CRAN master site (maybe longer to reach CRAN mirrors).
Binary packages are built for the current version of R: they may also be built for the last version in the previous series (e.g. R 3.1.3 when R 3.2.x is current) or for R-devel.
Questions about binary packages should be addressed to those responsible for building them: Simon Urbanek (OS X) and Uwe Ligges (Windows); email addresses ‘First.Lastname@R-project.org’.
You can also check here online to try to pass a fake Windows CRAN check: https://win-builder.r-project.org/
@Laurae2
all its package dependencies on CRAN are available for that platform. Dependencies from other repositories will be installed at CRAN’s discretion.
So the dll of vc runtime is ok ?
Another solution: upload our dlls (include vc runtime) to another website (like nuget), and download it when install R-package ?
@guolinke We need to ask uwe.ligges@r-project.org to see if it is feasible or not to use Visual Studio DLL requiring VC runtime.
The whole installation and the DLL compilation must be done by Uwe Ligges.
Hello, Good to see that you are close to release it to CRAN. Thanks for that!
I am going to start with the vignette.
Is there any point you want to highlight?. And regarding the time frame for this, when should it be ready?.
Thanks! Carlos.
@Laurae2 okay, can you help to ask Uwe Ligges ?
@guolinke yes, but I'm currently on a project so I don't have enough time anymore.
Perhaps @dselivanov can?
I doubt cran will accept anything which requires vc or something not open sourced. I'm out of context, could you point me why vc is used instead of mingw?
@dselivanov VS is used by default for performance. But now we have an automatic fallback to Rtools' MinGW, so it is fine.
@Laurae2 can we enable the test of R package in travis / appveyor?
@guolinke Yes, we also need to add more appropriate tests to testthat folder (it's very outdated, from 6 months ago).
https://github.com/Microsoft/LightGBM/tree/master/R-package/tests/testthat
how to install lightgbm in the R-language?
@Royhuiy read https://github.com/Microsoft/LightGBM/tree/master/R-package and open a new issue if you still are not managing to install the R package. If everything is setup correctly, then it requires only one line to install the R package.
@Laurae2 any updates or plans for CRAN ?
@guolinke no update yet, we would need to clean all the check errors/warnings/notes first
@Laurae2 any updates ?
@guolinke Currently not having enough time, perhaps @jameslamb can help us.
@Laurae2 @guolinke I have been traveling a lot recently, but back with my feet on the ground for a while and should have some time to help with this. Do you have a target date for CRAN release?
@Laurae2 I've fixed a few more of these today, will submit a PR in the next few days.
You can assign this issue to me if you'd like
I made a ton of progress on this yesterday. I have the R CMD CHECK
stuff down to a single note that I don't know how to best tackle. Details coming in a PR tomorrow.
Can you explain more about these two other tasks?
For the first one, it is not a CRAN requirement that the package be at repo root in source control. feather
, for example, has a really similar structure to LightGBM where they copy shared C code at build time into both Python and R code. [CRAN](https://cran.r-project.org/web/packages/feather/index.html | Github
I don't understand what you mean by number 2.
@jameslamb Trying to assign you the issue, it seems I can't add you (GitHub bug?).
Make examples runnable for tests
Here is an example: https://github.com/Microsoft/LightGBM/blob/master/R-package/R/lgb.train.R#L38 (currently, all examples are dontrun
)
Setup a LightGBM special version for CRAN (R-package must be root folder, not a child folder)
If this is not needed, then this requirement can be removed.
I made a ton of progress on this yesterday. I have the R CMD CHECK stuff down to a single note that I don't know how to best tackle. Details coming in a PR tomorrow.
Great! Waiting tomorrow to see it!
Do you have a target date for CRAN release?
ping @guolinke (I guess none currently?)
@Laurae2 @guolinke @randxie I just added #1499 😀 . I expect there to be a lot of questions on it...looking forward to the discussion!
Yes let's remove the requirement about the "special version" repo. It's not required.
@randxie and I can tackle the examples dontrun
thing at some point soon.
re: that final note left on #1499 , @ntdef pointed me to this: https://github.com/dmlc/xgboost/commit/1495a43cea05071e9451d008addffa598e4e7341
Apparently xgboost
team had the same problem with CRAN and printf()
. That gives me hope! Can address it in a follow-up PR
@Laurae2 can you update the checklist at the top of this? As of #1499 and given #1626 that I just created, I think it should look like this:
[not necessary] Setup a LightGBM special version for CRAN (R-package must be root folder, not a child folder) [done] Fix CRAN errors [done] Fix CRAN warnings Fix CRAN notes [done] Add vignettes [done] Make examples runnable for tests Switch to MinGW by default on Windows (it falls to MinGW if VS fails to be found) [done] Clean install everytime when not using precompiled dll/lib Pass CRAN checks on Windows Pass CRAN checks on Linux Submit to CRAN Get accepted on CRAN
@jameslamb updated the checklist
thank you!
Any more recent updates regarding this? Is the checklist above up to date? Perhaps I can help get this package CRAN ready.
Hey @cdeterman apologies for the delay! We have one R CMD CHECK "NOTE" remaining on Linux distributions, building from source. That is the next highest-priority item on our road to CRAN.
I will write it up as a separate issue and link it here.
@cdeterman see the above, let me know if you have questions!
@Laurae2 @StrikerRUS I've created label r-package
and am going to go through and tag outstanding R issues with it so others can see what is in the R backlog more easily than via text search on issue titles.
@guolinke @jameslamb Rinternals.h is LGPL v2.1 (or any later version) so we are free to hook onto it (with certain limitations, like not copy&pasting a whole block of Rinternals.h onto LightGBM) while keeping MIT license of LightGBM and LightGBM R-package.
https://github.com/wch/r-source/blob/tags/R-3-5-3/src/include/Rinternals.h#L7
@Laurae2 good investigation, that is exciting! I see it's the same for R.h
(https://github.com/wch/r-source/blob/tags/R-3-5-3/src/include/R.h#L6) which we may also need. If we are ok to use these headers, than I think I know how to solve #1910 and I also think it would give us a path to solve #1909 .
I've assigned #1910 to myself as I already have a branch where I'd started working down this path. Will put up a PR when I can to address it.
@jameslamb Any updates?
Hey @StrikerRUS sorry for the (very long) delay answering here! For those following this issue, we have at least two lingering issues to address to get to CRAN:
I just introduced #2837 which is a step towards that. It adds the idea of R defining an option for cmake
which says "hey you are compiling lig_lightgbm
specifically for the R package". To finish #1910 and #1909 I think I'll need to replace our custom handling of the R-to-C interface (https://github.com/microsoft/LightGBM/blob/master/include/LightGBM/R_object_helper.h) with the headers that ship with R. I tried to just #include <R.h>
to get e.g Rprintf()
, but that led to a lot of name conflicts.
So I'm attacking this in the following order:
lib_lightgbm
specific to the R package (#2837)<R.h>
and <Rinternals.h>
to address #1909 R CMD INSTALL lightgbm_*.tar.gz
) on different operating systems@jameslamb Thanks for the heads-up!
I think one more step (related to n. 4 in your list) is to enable Windows R-package job (#2335) that can be done in the nearest future.
@jameslamb Thanks for the heads-up!
I think one more step (related to n. 4 in your list) is to enable Windows R-package job (#2335) that can be done in the nearest future.
great point! Yes I agree, that will help. I do have an old-ish Windows laptop. Maybe I'll spin that up and start developing on it 😀
@jameslamb
I do have an old-ish Windows laptop. Maybe I'll spin that up and start developing on it 😀
Appveyor has a great functionality that allows to connect to it via RDP: https://www.appveyor.com/docs/how-to/rdp-to-build-worker/. It helped me many times! You can use it to setup R package pipeline. Also, don't forget about https://github.com/RGF-team/rgf/blob/master/R-package/.R.appveyor.ps1 😉
I think we're getting pretty close! Made good progress this week.
gcc/g++
8.1.0The 1 NOTE referenced above is actually not a problem. If you run R CMD CHECK
on a package that is not on CRAN yet, it will throw up a note saying "hey this is a new package that hasn't been seen on CRAN before". So that one won't be a problem when we submit.
Next step after getting those 4 PRs reviewed and merged will be to add Windows CI (#2335) and (in parallel) to try running win-builder.
Wanted to provide an update for anyone watching this issue.
clang
and gcc
(https://github.com/microsoft/LightGBM/pull/2911#issuecomment-602964871).R CMD CHECK
NOTEs there which seem like they'll be straightforward to address.The next thing we need to figure out is whether or not CRAN will accept a package that does not support 32-bit Windows. Notice that in #2936 I'm running R CMD check --as-cran --no-multiarch
because otherwise when you run on 64-bitt Windows R will try to build and check the package for both architectures. We don't currently support 32-bit R.
I don't see anything specific about architecture requirements in https://cran.r-project.org/web/packages/policies.html#Source-packages. I don't see any 32-bit Windows environments in CRAN's standard set of checks but it's possible that they bold both architectures on a 64-bit system. We will see!
@jameslamb
I don't see any 32-bit Windows environments in CRAN's standard set of checks but it's possible that they bold both architectures on a 64-bit system.
I do see two mentiones of 32bit:
i686-posix-dwarf
i686-posix-dwarf
ah! I've never heard of i686 before. Ok, that might be a thing we have to figure out. We'll get there.
ah! I've never heard of i686 before. Ok, that might be a thing we have to figure out. We'll get there.
Maybe we just can wait for a while? 😄
Beginning with Windows 10, version 2004, all new Windows 10 systems will be required to use 64-bit builds and Microsoft will no longer release 32-bit builds for OEM distribution. https://www.engadget.com/windows-10-32-bit-oem-173055990.html
Ok I took the current version of the R package on #3188 , since I think that PR will be merged soon, and built an R source package to check on R Hub.
R Hub is a free project that runs R CMD check
on an R package on a bunch of platforms.
It doesn't respond fast enough to be part of our normal CI, and it is independent of CRAN so the environments there are slightly different, but it is a free way to test on a lot of setups, especially older Windows environments.
The results are below. I don't know how long the log links last, but I included those as well
Not all of the problems below are problems we will face on CRAN, and none of them should block #3188 . But a few are worth looking into, with more targeted uses of rhub::check()
.
❌ Oracle Solaris 10, x86, 32 bit, R-release, Oracle Developer Studio 12.6
❌ Oracle Solaris 10, x86, 32 bit, R-release
✅ macOS 10.13.6 High Sierra, R-release, CRAN's setup
✅ macOS 10.13.6 High Sierra, R-release, brew
⚠️ Ubuntu Linux 16.04 LTS, R-release, GCC
⚠️ Fedora Linux, R-devel, clang, gfortran
✅ Windows Server 2008 R2 SP1, R-oldrel, 32/64 bit
✅ Windows Server 2008 R2 SP1, R-release, 32/64 bit
✅ Windows Server 2008 R2 SP1, R-patched, 32/64 bit
✅ Windows Server 2008 R2 SP1, R-devel, 32/64 bit
✅ Windows Server 2008 R2 SP1, R-release, 32/64 bit
⚠️ Ubuntu Linux 16.04 LTS, R-devel, GCC
⚠️ Debian Linux, R-devel, GCC, no long double
⚠️ Debian Linux, R-devel, clang, ISO-8859-15 locale
❌ Debian Linux, R-devel, GCC ASAN/UBSAN
❌ CentOS 6 with Redhat Developer Toolset, R from EPEL
The following notes and warnings can be safely ignored and should not block us on CRAN.
* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Guolin Ke <guolin.ke@microsoft.com>’
New submission
License components with restrictions and base license permitting such:
MIT + file LICENSE
* checking whether package ‘lightgbm’ can be installed ... WARNING
Found the following significant warnings:
Warning: package ‘R6’ was built under R version 4.0.2
See ‘/Users/userllqNxUln/lightgbm.Rcheck/00install.out’ for details.
* checking top-level files ... WARNING
A complete check needs the 'checkbashisms' script.
See section ‘Configure and cleanup’ in the ‘Writing R Extensions’
manual.
* checking installed package size ... NOTE
installed size is 35.7Mb
sub-directories of 1Mb or more:
libs 35.1Mb
@jameslamb
for oracle Solaris, maybe we can disable the network module.
for the misaligned
error, maybe the removal of MM_MALLOC
is the root cause?
Ah, Solaris 11 has ifaddrs.h
file, but unfortunately CRAN runs Solaris 10: https://issues.prosody.im/419
for oracle Solaris, maybe we can disable the network module.
I was thinking the same thing!
for the misaligned error, maybe the removal of MM_MALLOC is the root cause?
I don't think so. On the build with that issue, I see this
checking whether MM_PREFETCH works... yes
checking whether MM_MALLOC works... yes
and -DMM_MALLOC=1
in the compiler statements.
Maybe we can use a patch for Solaris 10 from psutil
?
https://github.com/giampaolo/psutil/blob/master/psutil/arch/solaris/v10/ifaddrs.h https://github.com/giampaolo/psutil/blob/master/psutil/arch/solaris/v10/ifaddrs.c
It looks quite small.
Environment info
Operating System: Windows 8.1 Pro CPU: i7-4600U R version: 3.4
To make a release on CRAN, we will need first to fix all the errors / warnings / notes. Currently testing on Windows, but we will also need to test on Linux. If some of them cannot be fixed, we will need to have an explanation for each of those which will not be fixed by us. @guolinke
Maybe time to add some vignettes @coforfe if you want to work on them.
00install.out:
Windows (fake) CRAN log: