Closed bob-rietveld closed 5 years ago
Can you provide reproducible code on this?
I just ran the example:
library(fastrtext)
library(ruimtehol)
data(train_sentences, package = "fastrtext")
filename <- tempfile()
writeLines(text = paste(paste0("__label__", train_sentences$class.text), tolower(train_sentences$text)),
           con = filename)
model <- starspace(file = filename,
                   trainMode = 0, label = "__label__",
                   similarity = "dot", verbose = TRUE, initRandSd = 0.01, adagrad = FALSE,
                   ngrams = 1, lr = 0.01, epoch = 5, thread = 20, dim = 10, negSearchLimit = 5, maxNegSamples = 3)
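For readers unfamiliar with the input format, here is a minimal sketch of what the `writeLines()` call above puts in the training file: one example per line, with the label prefixed by `__label__` and the text lowercased. The `toy` data frame and its contents are made up for illustration and only mimic the `class.text` and `text` columns of `train_sentences`.

```r
# Hypothetical two-row stand-in for fastrtext's train_sentences
# (same column names; the values are invented for this sketch).
toy <- data.frame(
  class.text = c("OWNX", "MISC"),
  text = c("We Propose a Model", "See Smith et al"),
  stringsAsFactors = FALSE
)

# Same construction as in the example: "__label__<class> <lowercased text>"
lines <- paste(paste0("__label__", toy$class.text), tolower(toy$text))

toy_file <- tempfile()
writeLines(text = lines, con = toy_file)

# Read the file back to inspect what a trainer would see
written <- readLines(toy_file)
print(written)
```

Each line starts with the label marker that the `label = "__label__"` argument tells the trainer to look for.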
I've developed this on Windows; I still need to check on other platforms.
Note to myself: this is probably related to printing the evolution of the loss through Rcpp::cout.
The problem was that the printed trace of the training log was too big. For the time being, I've solved this by sending the training log to cerr and cout instead of Rcpp::cout. I also tested this on Ubuntu (Linux), so I presume it will fix the problem you have on macOS as well. Please reinstall and try it out.
Thanks for the quick response and fix. The error has been resolved.
I have a different issue now, however. After starting training (using the code above), the process just hangs (see output below). It does not proceed to the next epoch. I do not get any error message and am not sure how to debug this.
I can see the CPU is busy with something, and the memory usage of RStudio is OK. Thanks for your continued help.
Start to initialize starspace model.
Build dict from input file : /var/folders/hs/yw76yd_95lscwclwg15n73tw0000gn/T//RtmpSLtNje/filecda1835e174
Read 0M words
Number of words in dictionary: 5060
Number of labels in dictionary: 15
Loading data from file : /var/folders/hs/yw76yd_95lscwclwg15n73tw0000gn/T//RtmpSLtNje/filecda1835e174
Total number of examples loaded : 2517
Initialized model weights. Model size :
matrix : 5075 10
Training epoch 0: 0.01 0.002
Epoch: 99.2% lr: 0.010000 loss: 0.029883 eta: <1min tot: 0h0m0s (19.8%)
Thanks. Please open a new issue, as this no longer seems to be the same problem.
Hi,
I installed the latest dev version and ran into an error. I did a clean install and tried to run the tagspace example, but received an error message like this:
Error: C stack usage 17587557196884 is too close to the limit
On another occasion, I did not get any error message, but the process hung after the first epoch and did not converge. Is there anything I can change in terms of memory or R version? The session info is below. Thanks for helping out.
B.
R session info

R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS 10.13.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ruimtehol_0.1 fastrtext_0.2.5

loaded via a namespace (and not attached):
[1] httr_1.3.1 assertthat_0.2.0 R6_2.2.2 tools_3.3.2 withr_2.1.2 curl_3.1 yaml_2.1.15 Rcpp_0.12.18
[9] memoise_1.1.0 codetools_0.2-15 git2r_0.21.0 digest_0.6.13 devtools_1.13.4

**C stack info**

Cstack_info()
      size    current  direction eval_depth
   7969177      16280          1          2
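For anyone wanting to attach the same diagnostics to a report, the following is a small sketch using only base R (`Cstack_info()` and `sessionInfo()`). The usage percentage is an illustrative extra computed from the reported values, not something the thread itself calculates.

```r
# C stack diagnostics: a named integer vector with
# size, current, direction and eval_depth.
info <- Cstack_info()
print(info)

# Share of the C stack currently in use.
# 'size' can be NA on platforms where it cannot be determined,
# in which case 'usage' is NA as well.
usage <- unname(info[["current"]] / info[["size"]])
cat(sprintf("C stack in use: %.2f%%\n", 100 * usage))

# R version, platform, locale and loaded packages
print(sessionInfo())
```

With the values quoted above (size 7969177, current 16280), only a small fraction of the stack is in use at the top level, which suggests that the enormous number in the error message does not reflect ordinary R-level recursion.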