h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Set up a drat repository for the h2o R package #15284

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Since the CRAN release is typically at least one release behind the latest stable version, we could set up a [drat repository|https://github.com/eddelbuettel/drat] for the h2o R package. This would allow users to get stable release updates via the usual update.packages functionality, rather than waiting for CRAN.
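As a rough illustration of what this would look like for a user, here is a minimal sketch in base R. The repository URL is a placeholder (no public URL exists in this thread), and the repo name `h2o` in the vector is arbitrary.

```r
# Hypothetical URL: stands in for wherever the drat repo ends up being hosted.
h2o_repo <- "https://example.org/h2o-drat"

# Put the drat repo in front of the repos already configured (usually CRAN),
# so both are consulted when looking for available package versions.
options(repos = c(h2o = h2o_repo, getOption("repos")))

# Base R tooling then considers the drat repo too; oldPkgs limits the
# update check to h2o, and ask = FALSE skips the interactive prompt.
update.packages(oldPkgs = "h2o", ask = FALSE)
```

Setting `options(repos = ...)` in `.Rprofile` would make this persistent, so later `update.packages()` calls need no extra arguments.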

This will prevent us from having to "solve" issues on h2oStream that have already been addressed. A lot of our users appear to be using the CRAN version, so they end up asking questions about bugs that have already been resolved.

More info on how rOpenSci set up their own drat repo is in this [blog post|https://ropensci.org/blog/2015/08/04/a-drat-repository-for-ropensci/].

exalate-issue-sync[bot] commented 1 year ago

Erin LeDell commented: [~accountid:557058:15ff181d-2f0b-4dfb-84e3-6a073bd7a9d2] Do you know if a DRAT repo for the h2o package will also install the R package dependencies automatically?

exalate-issue-sync[bot] commented 1 year ago

Jan Gorecki commented: It could. There are 37 recursive deps of h2o, and all of them could be included in the repo together with the h2o R package. Otherwise the user will need to provide two repos when using install.packages to ensure the deps can be resolved. I will write a longer note this weekend with my remarks on the current R build process, including releasing via drat.
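The two-repo case mentioned above can be sketched as follows. The drat URL is a placeholder, and the dependency count printed depends on the h2o version currently on CRAN, so it may differ from the figure quoted in the comment.

```r
# Hypothetical drat URL plus a standard CRAN mirror.
repos <- c(h2o  = "https://example.org/h2o-drat",
           CRAN = "https://cloud.r-project.org")

# List the recursive dependencies of h2o as seen from the CRAN index.
db <- available.packages(repos = repos["CRAN"])
deps <- tools::package_dependencies(
  "h2o", db = db,
  which = c("Depends", "Imports", "LinkingTo"),
  recursive = TRUE
)[["h2o"]]
length(deps)

# If the drat repo holds only h2o, both repos must be passed so that
# install.packages can resolve the dependencies from CRAN.
install.packages("h2o", repos = repos)
```

If the drat repo also carried every recursive dependency, the single drat URL would be enough in the `repos` argument.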

exalate-issue-sync[bot] commented 1 year ago

Jan Gorecki commented: Drat is generally ready, but it is built on gitlab.0xdata.loc, which is not part of any workflow. Anmol may know something about migrating this to Jenkins. The drat repo(s) include all recursive deps, so they can be resolved automatically in base R and installed as source packages. The Win and Mac builders must produce package binaries, exported as artifacts to the drat-builder job. Do you think h2oEnsemble and RSparkling should be in the same drat repo, or each pkg in its own?
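For readers unfamiliar with the publishing side, the step a drat-builder job performs might look roughly like the sketch below. The directory and the tarball name are placeholders, not the actual build layout, and this is not necessarily how the existing job is written.

```r
library(drat)

# Hypothetical local checkout of the drat repository, and a placeholder
# file name standing in for the built source package.
repodir <- "~/git/h2o-drat"
tarball <- "h2o_x.y.z.tar.gz"

# Copies the package into the repository tree under repodir and
# regenerates the PACKAGES index that install.packages reads.
insertPackage(tarball, repodir = repodir)
```

The same call would be repeated for each dependency tarball, and for the Windows and Mac binaries once the builders export them.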

exalate-issue-sync[bot] commented 1 year ago

Erin LeDell commented: I don't have a strong preference on same drat repo vs individual repos, whatever makes the most sense.

exalate-issue-sync[bot] commented 1 year ago

Jan Gorecki commented: To push this issue a little bit forward I would like to better define scope. We need the following R repositories:

  1. "stable": recent release branches for h2o, h2oEnsemble and rsparkling (+ recursive deps)
  2. "nightly"/"master": recent master branches for h2o, h2oEnsemble and rsparkling
  3. "stable+jar": recent release branch for h2o with h2o.jar included. AFAIK some customers expect to have the jar included in the h2o pkg
  4. "h2o-3": build & test R deps

This will handle all needed scenarios AFAIU, please comment if you see something redundant or missing. [~accountid:557058:1529d34e-fbf0-45b1-8f33-facc04ecb672] [~accountid:557058:afd6e9a4-1891-4845-98ea-b5d34a2bc42c] [~accountid:557058:389d9607-5bd8-4611-8c6a-755fe9295223] [~accountid:557058:58c8a123-f885-400f-9d0a-d4b9277a4951]

Most of the above is ready, or easy to add. What is missing is the public URL (e.g. cran.h2o.ai) that will point to a webserver, or eventually an S3 bucket.

DinukaH2O commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-2377
Assignee: Jan Gorecki
Reporter: Erin LeDell
State: Open
Fix Version: N/A
Attachments: Available (Count: 1)
Development PRs: N/A

Attachments From Jira

Attachment Name: Screen Shot 2016-03-15 at 3.21.21 PM.png
Attached By: Erin LeDell
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-2377/Screen Shot 2016-03-15 at 3.21.21 PM.png