Open antoinecarme opened 1 year ago
All required r-cran-* packages need to be installed on Debian.
The most important ones are r-cran-forecast and r-cran-caret.
antoine@z600:~/dev/python/packages/timeseries/pyaf$ apt-cache show r-cran-forecast
Package: r-cran-forecast
Version: 8.17.0-1
Installed-Size: 1914
Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
Architecture: amd64
Depends: r-base-core (>= 4.2.1-1), r-api-4.0, r-cran-colorspace, r-cran-fracdiff, r-cran-generics (>= 0.1.2), r-cran-ggplot2 (>= 2.2.1), r-cran-lmtest, r-cran-magrittr, r-cran-nnet, r-cran-rcpp (>= 0.11.0), r-cran-timedate, r-cran-tseries, r-cran-urca, r-cran-zoo, r-cran-rcpparmadillo (>= 0.2.35), libblas3 | libblas.so.3, libc6 (>= 2.29), libgcc-s1 (>= 3.0), libstdc++6 (>= 11)
Recommends: r-cran-testthat, r-cran-uroot
Suggests: r-cran-knitr, r-cran-rmarkdown
Description-en: GNU R forecasting functions for time series and linear models
Methods and tools for displaying and analysing
univariate time series forecasts including exponential smoothing
via state space models and automatic ARIMA modelling.
Description-md5: fbe002920852e5d23ff950431c9f03c4
Homepage: https://cran.r-project.org/package=forecast
Section: gnu-r
Priority: optional
Filename: pool/main/r/r-cran-forecast/r-cran-forecast_8.17.0-1_amd64.deb
Size: 1540732
MD5sum: ad90255623ef7f6c6719b7befca32f49
antoine@z600:~/dev/python/packages/timeseries/pyaf$ apt-cache show r-cran-caret
Package: r-cran-caret
Version: 6.0-93+dfsg-1
Installed-Size: 3668
Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
Architecture: amd64
Depends: r-base-core (>= 4.2.1-2), r-api-4.0, r-cran-ggplot2, r-cran-lattice (>= 0.20), r-cran-e1071, r-cran-foreach, r-cran-modelmetrics (>= 1.2.2.2), r-cran-nlme, r-cran-plyr, r-cran-proc, r-cran-recipes (>= 0.1.10), r-cran-reshape2, r-cran-withr (>= 2.0.0), libc6 (>= 2.4)
Recommends: r-cran-testthat (>= 0.9.1), r-cran-earth (>= 2.2-3), r-cran-mda, r-cran-mlmetrics, r-cran-fastica, r-cran-kernlab, r-cran-themis (>= 0.1.3)
Suggests: r-cran-bradleyterry2, r-cran-covr, r-cran-dplyr, r-cran-ellipse, r-cran-gam (>= 1.15), r-cran-ipred, r-cran-knitr, r-cran-mass, r-cran-matrix, r-cran-mgcv, r-cran-mlbench, r-cran-nnet, r-cran-party (>= 0.9-99992), r-cran-pls, r-cran-proxy, r-cran-randomforest, r-cran-rann, r-cran-rmarkdown, r-cran-rpart
Description-en: GNU R package for classification and regression training
This GNU R package provides misc functions for training and plotting
classification and regression models.
Description-md5: 568fff6316b184e50b859b0f39211d0d
Homepage: https://cran.r-project.org/package=caret
Section: gnu-r
Priority: optional
Filename: pool/main/r/r-cran-caret/r-cran-caret_6.0-93+dfsg-1_amd64.deb
Size: 3446832
MD5sum: d81b051a65be49cff8f69a1828f3bc3d
SHA256: 8225d86fd41959ba6c4314b0b3df39ff2f93fb5cd0218500bf4dc4f4d684151a
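A minimal sketch of how the Python side could check for these Debian packages before calling R. The function names are hypothetical, and the naming rule ("r-cran-" plus the lowercased R package name) is an assumption inferred from the listings above (for example r-cran-rcpparmadillo and r-cran-modelmetrics).

```python
import shutil
import subprocess


def debian_package_name(r_package):
    """Map an R package name to its Debian package name.

    Assumption: Debian uses "r-cran-" + the lowercased CRAN name,
    as seen in the Depends lines above.
    """
    return "r-cran-" + r_package.lower()


def install_command(r_packages):
    """Build (but do not run) an apt-get command for the given R packages."""
    names = sorted({debian_package_name(p) for p in r_packages})
    return ["apt-get", "install", "-y"] + names


def is_installed(debian_package):
    """Return True if dpkg reports the package as installed."""
    if shutil.which("dpkg-query") is None:
        return False  # not a Debian-like system, nothing to query
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Status}", debian_package],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and "install ok installed" in result.stdout


if __name__ == "__main__":
    for name in ["forecast", "caret"]:
        package = debian_package_name(name)
        print(package, "installed" if is_installed(package) else "missing")
    print(" ".join(install_command(["forecast", "caret"])))
```

The install command is only built, not executed, so the caller decides whether to run it with elevated privileges.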
pyaf needs a set of models that generate custom R scripts, which in turn build the corresponding R forecasting models.
This is a prototyping environment, so it can be slow, and that is OK.
All logs coming from R should be saved under /tmp/pyaf_prototyping/model_name_session/(train|predict).(err|log)
The training script is saved by Python (and used in R) under /tmp/pyaf_prototyping/model_name/train.R
The training dataset is saved by Python (and used in R) under /tmp/pyaf_prototyping/model_name/training.csv
R models are saved in R (and reloaded before each forecast/predict) under /tmp/pyaf_prototyping/model_name/model.rds
The forecasting/predict script is saved by Python (and used in R) under /tmp/pyaf_prototyping/model_name/predict.R
The forecast/predict dataset is saved by Python (and used in R) under /tmp/pyaf_prototyping/model_name/model_name_input.csv
model_name should contain the type of model (TAR, TSMARS, ...) and a unique string (date, process_id, etc.).
Output datasets are saved by R (and used in Python) under /tmp/pyaf_prototyping/model_name/model_name_output.csv
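The layout above could be built on the Python side roughly as follows. This is a sketch, not the branch's actual code: the function names are hypothetical, the timestamp format is inferred from the sample directory names in the scripts in this issue, and using the process id as the unique suffix is an assumption.

```python
import os
from datetime import datetime
from pathlib import Path

# Root directory named in the description above.
ROOT = Path("/tmp/pyaf_prototyping")


def make_session_name(model_type, now=None, unique_id=None):
    """Build a name such as threshold_ar_20220905164142.004041_<id>."""
    now = now or datetime.now()
    # Assumption: any unique integer works; the process id is one option.
    unique_id = os.getpid() if unique_id is None else unique_id
    return f"{model_type}_{now.strftime('%Y%m%d%H%M%S.%f')}_{unique_id}"


def session_paths(session_name, root=ROOT):
    """Return the files shared between Python and R for one model."""
    base = Path(root) / session_name
    return {
        "train_script": base / "train.R",
        "training_data": base / "training.csv",
        "model": base / "model.rds",
        "train_log": base / "train.log",
        "train_err": base / "train.err",
        "train_lock": base / "train.lock",
    }


if __name__ == "__main__":
    name = make_session_name("threshold_ar")
    for key, path in session_paths(name).items():
        print(key, path)
```

Keeping every path in one dictionary makes it easy to pass the same locations to both the script generator and the code that reads the results back.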
Sample R training script for Threshold AR models (auto-generated by pyaf for each internal model)
write('', "/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.lock")
options(warn=1);
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.log" , open="wt"), type="output");
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.err" , open="wt"), type="message");
set.seed(1960)
paste("R_VERSION" , R.version.string)
df = read.csv("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/training.csv", header=TRUE)
library(NTS, quietly = TRUE);
cat("R_PACKAGE_VERSION", "NTS", toString(packageVersion("NTS")) , "\n");
thresholds.est = uTAR(y=df$TGT, p1=2, p2=2, d=2, thrQ=c(0,1), Trim=c(0.1,0.9), include.mean=TRUE, method="NeSS", k0=50);
model = uTAR.est(y=df$TGT, arorder=c(2,2), thr=thresholds.est$thr, d=2);
saveRDS(model, "/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/model.rds")
file.remove("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.lock")
sink(type="output");
sink(type="message");
print('end')
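A script like the one above could be generated and launched from Python along these lines. This is a minimal sketch under stated assumptions: the template, the function names and the timeout are hypothetical, and the only external tool used is the standard Rscript command.

```python
import shutil
import subprocess
from pathlib import Path
from string import Template

# Skeleton shared by all models; $model_code holds the model-specific R lines.
TRAIN_TEMPLATE = Template("""\
write('', "$dir/train.lock")
options(warn=1);
sink(file("$dir/train.log", open="wt"), type="output");
sink(file("$dir/train.err", open="wt"), type="message");
set.seed(1960)
df = read.csv("$dir/training.csv", header=TRUE)
$model_code
saveRDS(model, "$dir/model.rds")
file.remove("$dir/train.lock")
sink(type="output");
sink(type="message");
""")


def render_train_script(session_dir, model_code):
    """Fill the template; substituted values are not re-scanned for '$'."""
    return TRAIN_TEMPLATE.substitute(dir=str(session_dir), model_code=model_code)


def run_train_script(session_dir, model_code, timeout=600):
    """Write train.R, run it with Rscript and return the model path."""
    session_dir = Path(session_dir)
    session_dir.mkdir(parents=True, exist_ok=True)
    script = session_dir / "train.R"
    script.write_text(render_train_script(session_dir, model_code))
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    # check=True raises if R exits with a non-zero status.
    subprocess.run(["Rscript", str(script)], check=True, timeout=timeout)
    if (session_dir / "train.lock").exists():
        raise RuntimeError("train.lock still present, training did not finish")
    return session_dir / "model.rds"
```

The lock file check mirrors the script itself: the lock is written first and removed only after the model has been saved, so a leftover lock suggests that training stopped early and train.err is worth reading.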
Sample forecast/predict script for Threshold AR models (auto-generated by pyaf for each model forecast)
write('', "/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.lock")
options(warn=1);
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.log" , open="wt"), type="output");
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.err" , open="wt"), type="message");
paste("R_VERSION" , R.version.string)
df = read.csv("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208_input.csv", header=TRUE)
reloaded_model = readRDS("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/model.rds")
library(NTS, quietly = TRUE);
cat("R_PACKAGE_VERSION", "NTS", toString(packageVersion("NTS")) , "\n");
predicted = uTAR.pred(mode=reloaded_model, orig=0 , h=204 - sum(reloaded_model$nobs),iterations=100,ci=0.95,output=TRUE)
nempty = length(reloaded_model$data) - length(reloaded_model$residuals)
residuals = rbind(matrix(0, nempty) , matrix(reloaded_model$residuals))
data = reloaded_model$data
fitted = data + residuals
predicted = rbind(fitted, predicted$pred)
write.csv(predicted, file = "/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208_output.csv")
file.remove("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.lock")
sink(type="output");
sink(type="message");
print('end')
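On the Python side, the output file written by write.csv above could be read back as sketched below. The function names are hypothetical. The sketch assumes the file has a header row and a leading row-name column (the write.csv defaults), takes the last field of each row as the value, and maps R's NA to NaN. The numbers in the demo are made up for illustration.

```python
import csv
import math
import tempfile
from pathlib import Path


def read_r_output(csv_path):
    """Read the single value column of a file produced by R's write.csv."""
    values = []
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        for row in reader:
            cell = row[-1]  # first field is the R row name, last is the value
            values.append(math.nan if cell == "NA" else float(cell))
    return values


def split_fitted_and_forecast(values, horizon):
    """Split into in-sample fitted values and the last `horizon` forecasts."""
    if horizon <= 0:
        return list(values), []
    return values[:-horizon], values[-horizon:]


if __name__ == "__main__":
    # Illustrative file in the shape write.csv gives for a one-column matrix.
    sample = '"","V1"\n"1",0\n"2",1.5\n"3",NA\n"4",2.25\n'
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "predict_output.csv"
        path.write_text(sample)
        fitted, forecast = split_fitted_and_forecast(read_r_output(path), 1)
        print(fitted, forecast)
```

Because the R script stacks the fitted values on top of the predictions, the forecast horizon is all that is needed to separate the two parts again.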
Sample MARS model using R Caret prototyping.
Sample TAR Model using R NTS package
R_modeling branch :
https://github.com/antoinecarme/pyaf/tree/R_modeling/
Specific prototyping tests :
https://github.com/antoinecarme/pyaf/tree/R_modeling/tests/caret_r_prototypes
It is useful to have a git branch which contains all the tooling needed for prototyping.
The goal is to make it possible to use R/forecast from inside pyaf, through "fake" pyaf models which call R to validate a specific implementation.
This branch is not to be merged.
First application : Threshold AR models #214 and TSMARS models #215