bnosac / ruimtehol

R package to Embed All the Things! using StarSpace
Mozilla Public License 2.0

Stack usage Error #2

Closed bob-rietveld closed 5 years ago

bob-rietveld commented 5 years ago

Hi,

I installed the latest dev version and ran into an error. After a clean install I tried to run the tagspace example, but received a similar error message:

Error: C stack usage 17587557196884 is too close to the limit.

On another occasion I did not get any error message, but the process hung after the first epoch and did not converge. Is there anything I can change in terms of memory or R version? The session info is below. Thanks for helping out.

B.

R session info:

R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS 10.13.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ruimtehol_0.1 fastrtext_0.2.5

loaded via a namespace (and not attached):
[1] httr_1.3.1 assertthat_0.2.0 R6_2.2.2 tools_3.3.2 withr_2.1.2 curl_3.1 yaml_2.1.15 Rcpp_0.12.18
[9] memoise_1.1.0 codetools_0.2-15 git2r_0.21.0 digest_0.6.13 devtools_1.13.4

C stack info:

Cstack_info()
      size    current  direction eval_depth
   7969177      16280          1          2
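
For readers comparing the two numbers in this report, here is a minimal R sketch. It only reuses the values quoted above and the base R function `Cstack_info()`; the variable names are made up for illustration. The reported usage is far larger than the whole stack, which usually suggests that the stack check itself is misfiring (for example when R's API is called from a worker thread) rather than genuinely deep recursion. That interpretation is an assumption here, not something the report confirms.

```r
# Values copied from the report above; nothing is measured anew.
reported_usage <- 17587557196884  # bytes, from the error message
stack_size     <- 7969177         # bytes, "size" in the poster's Cstack_info()

# How many times larger than the entire stack is the claimed usage?
ratio <- reported_usage / stack_size
usage_exceeds_stack <- ratio > 1

# Inspect the same fields in the current session for comparison.
# Cstack_info() returns a named integer vector; "size" may be NA on
# platforms where the limit cannot be determined.
info <- Cstack_info()
print(info[c("size", "current", "direction", "eval_depth")])
print(usage_exceeds_stack)
```

If `usage_exceeds_stack` is `TRUE`, as it is for the values quoted here, raising the stack limit is unlikely to help on its own.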

jwijffels commented 5 years ago

Can you provide reproducible code on this?

bob-rietveld commented 5 years ago

I just ran the example

library(fastrtext)
library(ruimtehol)
data(train_sentences, package = "fastrtext")

filename <- tempfile()
writeLines(text = paste(paste0("__label__", train_sentences$class.text), tolower(train_sentences$text)),
           con = filename)

model <- starspace(file = filename,
                   trainMode = 0, label = "__label__",
                   similarity = "dot", verbose = TRUE, initRandSd = 0.01, adagrad = FALSE,
                   ngrams = 1, lr = 0.01, epoch = 5, thread = 20, dim = 10,
                   negSearchLimit = 5, maxNegSamples = 3)

jwijffels commented 5 years ago

I've developed this on Windows; I still need to check on other platforms.

jwijffels commented 5 years ago

Note to self: this is probably related to using Rcpp::cout to print the evolution of the loss.

jwijffels commented 5 years ago

The problem was that the printed trace of the training log was too big. For the time being, I've solved this by sending the training log to cerr and cout instead of Rcpp::cout. I also tested this on Ubuntu (Linux), so I presume it will fix the problem you have on macOS as well. Please reinstall and try again.

bob-rietveld commented 5 years ago

Thanks for the quick response/fix. The error has been resolved.

I have a different issue now, however. After starting training (using the code above), the process just hangs (see the output below) and does not proceed to the next epoch. I do not get any error message and am not sure how to debug this.

I can see that the CPU is busy with something, and the memory usage of RStudio is OK. Thanks for your continued help.

Start to initialize starspace model.
Build dict from input file : /var/folders/hs/yw76yd_95lscwclwg15n73tw0000gn/T//RtmpSLtNje/filecda1835e174
Read 0M words
Number of words in dictionary:  5060
Number of labels in dictionary: 15
Loading data from file : /var/folders/hs/yw76yd_95lscwclwg15n73tw0000gn/T//RtmpSLtNje/filecda1835e174
Total number of examples loaded : 2517
Initialized model weights. Model size :
matrix : 5075 10
Training epoch 0: 0.01 0.002
Epoch: 99.2%  lr: 0.010000  loss: 0.029883  eta: <1min   tot: 0h0m0s  (19.8%)

jwijffels commented 5 years ago

Thanks. Please open a new issue, as this no longer seems to be the same problem.
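
When filing the follow-up issue, one low-cost check is whether the hang depends on the thread count, since the example requests `thread = 20`. The thread does not establish that threading causes the hang, so this is only a diagnostic sketch under that assumption. It uses `parallel::detectCores()` from base R; `requested_threads` and `threads` are illustrative names.

```r
library(parallel)

# Thread count used in the example above.
requested_threads <- 20L

# Number of logical cores; detectCores() can return NA on some platforms.
n_cores <- detectCores()

# Cap the thread count at the available cores, falling back to 1.
threads <- if (is.na(n_cores)) 1L else max(1L, min(requested_threads, n_cores))
print(threads)

# Hypothetical re-run, not executed here: pass `thread = threads`
# (or `thread = 1`) to starspace() with the other arguments unchanged
# and note whether training still stalls after the first epoch.
```

Reporting whether the stall also occurs with a single thread should help narrow down where to look.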