bnosac / doc2vec

Distributed Representations of Sentences and Documents

valgrind memory / Address Sanitizers checks #12

Closed jwijffels closed 3 years ago

jwijffels commented 3 years ago

Valgrind

==1826472== Memcheck, a memory error detector
==1826472== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1826472== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==1826472== Command: /data/blackswan/ripley/R/R-devel-vg/bin/exec/R --vanilla
==1826472== 

R Under development (unstable) (2020-12-09 r79601) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> pkgname <- "doc2vec"
> source(file.path(R.home("share"), "R", "examples-header.R"))
> options(warn = 1)
> library('doc2vec')
> 
> base::assign(".oldSearch", base::search(), pos = 'CheckExEnv')
> base::assign(".old_wd", base::getwd(), pos = 'CheckExEnv')
> cleanEx()
> nameEx("as.matrix.paragraph2vec")
> ### * as.matrix.paragraph2vec
> 
> flush(stderr()); flush(stdout())
> 
> ### Name: as.matrix.paragraph2vec
> ### Title: Get the document or word vectors of a paragraph2vec model
> ### Aliases: as.matrix.paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ data(belgium_parliament, package = "tokenizers.bpe")
+ x <- subset(belgium_parliament, language %in% "french")
+ x <- subset(x, nchar(text) > 0 & txt_count(text, pattern = " ") < 1000)
+ 
+ model <- paragraph2vec(x = x, type = "PV-DM",   dim = 15,  iter = 5)
+ 
+ embedding <- as.matrix(model, which = "docs")
+ embedding <- as.matrix(model, which = "words")
+ embedding <- as.matrix(model, which = "docs", normalize = FALSE)
+ embedding <- as.matrix(model, which = "words", normalize = FALSE)
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
==1826472== Warning: set address range perms: large range [0x2ffcf040, 0x47d47440) (undefined)
> ## End(Don't show)
> 
> 
> 
> cleanEx()

detaching ‘package:udpipe’, ‘package:tokenizers.bpe’

> nameEx("paragraph2vec")
> ### * paragraph2vec
> 
> flush(stderr()); flush(stdout())
> 
> ### Name: paragraph2vec
> ### Title: Train a paragraph2vec also known as doc2vec model on text
> ### Aliases: paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ ## Take data and standardise it a bit
+ data(belgium_parliament, package = "tokenizers.bpe")
+ str(belgium_parliament)
+ x <- subset(belgium_parliament, language %in% "french")
+ x$text   <- tolower(x$text)
+ x$text   <- gsub("[^[:alpha:]]", " ", x$text)
+ x$text   <- gsub("[[:space:]]+", " ", x$text)
+ x$text   <- trimws(x$text)
+ x$nwords <- txt_count(x$text, pattern = " ")
+ x <- subset(x, nwords < 1000 & nchar(text) > 0)
+ 
+ ## Build the model
+ model <- paragraph2vec(x = x, type = "PV-DM",   dim = 15,  iter = 5)
+ str(model)
+ embedding <- as.matrix(model, which = "words")
+ embedding <- as.matrix(model, which = "docs")
+ head(embedding)
+ 
+ ## Get vocabulary
+ vocab <- summary(model, type = "vocabulary",  which = "docs")
+ vocab <- summary(model, type = "vocabulary",  which = "words")
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
'data.frame':   2000 obs. of  3 variables:
 $ doc_id  : chr  "http://data.dekamer.be/v0/qrva/54-B144-14-1021-2017201819553" "http://data.dekamer.be/v0/qrva/54-B141-4-1075-2017201820260" "http://data.dekamer.be/v0/qrva/54-B143-4-1074-2017201820256" "http://data.dekamer.be/v0/qrva/54-B143-4-1076-2017201820265" ...
 $ text    : chr  "Percentage vrouwen met een eenoudergezin. \n\n In Wallonie werden de eenoudergezinnen onlangs gescreend. Daarui"| __truncated__ "Bescherming van de gegevens van kinderen. \n\n Op 25 mei 2018 zal de Algemene Verordening Gegevensbescherming ("| __truncated__ "Snel breedbandinternet. \n\n In het kader van het Plan voor ultrasnel internet in Belgie hebt u in 2015 uw voor"| __truncated__ "Rapport van UNICEF. - 'Danger in the air'. \n\n UNICEF heeft recent een nieuw rapport over de impact van luchtv"| __truncated__ ...
 $ language: Factor w/ 2 levels "dutch","french": 1 1 1 1 1 1 1 1 1 1 ...
==1826472== Invalid read of size 8
==1826472==    at 0x17343E09: WMD::~WMD() (packages/tests-vg/doc2vec/src/doc2vec/WMD.cpp:27)
==1826472==    by 0x1733A8A4: Doc2Vec::~Doc2Vec() (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:31)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:30)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:29)
==1826472==    by 0x17349F0F: finalizer_wrapper<Doc2Vec, Rcpp::standard_delete_finalizer<Doc2Vec> > (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:47)
==1826472==    by 0x17349F0F: void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:34)
==1826472==    by 0x52BD9C: R_RunWeakRefFinalizer (svn/R-devel/src/main/memory.c:1469)
==1826472==    by 0x52BFC8: RunFinalizers.isra.0 (svn/R-devel/src/main/memory.c:1536)
==1826472==    by 0x4DC8D4: bc_check_sigint (svn/R-devel/src/main/eval.c:5529)
==1826472==    by 0x4DC8D4: bcEval (svn/R-devel/src/main/eval.c:6723)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x532645: dispatchMethod (svn/R-devel/src/main/objects.c:436)
==1826472==    by 0x5329F2: Rf_usemethod (svn/R-devel/src/main/objects.c:486)
==1826472==    by 0x532DA4: do_usemethod (svn/R-devel/src/main/objects.c:565)
==1826472==  Address 0x1afb26c0 is 48 bytes inside a block of size 80 free'd
==1826472==    at 0x483BEDD: operator delete(void*) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:584)
==1826472==    by 0x1733A892: Doc2Vec::~Doc2Vec() (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:30)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:30)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:29)
==1826472==    by 0x17349F0F: finalizer_wrapper<Doc2Vec, Rcpp::standard_delete_finalizer<Doc2Vec> > (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:47)
==1826472==    by 0x17349F0F: void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:34)
==1826472==    by 0x52BD9C: R_RunWeakRefFinalizer (svn/R-devel/src/main/memory.c:1469)
==1826472==    by 0x52BFC8: RunFinalizers.isra.0 (svn/R-devel/src/main/memory.c:1536)
==1826472==    by 0x4DC8D4: bc_check_sigint (svn/R-devel/src/main/eval.c:5529)
==1826472==    by 0x4DC8D4: bcEval (svn/R-devel/src/main/eval.c:6723)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x532645: dispatchMethod (svn/R-devel/src/main/objects.c:436)
==1826472==    by 0x5329F2: Rf_usemethod (svn/R-devel/src/main/objects.c:486)
==1826472==    by 0x532DA4: do_usemethod (svn/R-devel/src/main/objects.c:565)
==1826472==  Block was alloc'd at
==1826472==    at 0x483AE7D: operator new(unsigned long) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:342)
==1826472==    by 0x1733B912: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:85)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472== 
==1826472== Invalid read of size 8
==1826472==    at 0x17343E37: WMD::~WMD() (packages/tests-vg/doc2vec/src/doc2vec/WMD.cpp:27)
==1826472==    by 0x1733A8A4: Doc2Vec::~Doc2Vec() (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:31)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:30)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:29)
==1826472==    by 0x17349F0F: finalizer_wrapper<Doc2Vec, Rcpp::standard_delete_finalizer<Doc2Vec> > (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:47)
==1826472==    by 0x17349F0F: void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:34)
==1826472==    by 0x52BD9C: R_RunWeakRefFinalizer (svn/R-devel/src/main/memory.c:1469)
==1826472==    by 0x52BFC8: RunFinalizers.isra.0 (svn/R-devel/src/main/memory.c:1536)
==1826472==    by 0x4DC8D4: bc_check_sigint (svn/R-devel/src/main/eval.c:5529)
==1826472==    by 0x4DC8D4: bcEval (svn/R-devel/src/main/eval.c:6723)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x532645: dispatchMethod (svn/R-devel/src/main/objects.c:436)
==1826472==    by 0x5329F2: Rf_usemethod (svn/R-devel/src/main/objects.c:486)
==1826472==    by 0x532DA4: do_usemethod (svn/R-devel/src/main/objects.c:565)
==1826472==  Address 0x1afb26c0 is 48 bytes inside a block of size 80 free'd
==1826472==    at 0x483BEDD: operator delete(void*) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:584)
==1826472==    by 0x1733A892: Doc2Vec::~Doc2Vec() (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:30)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:30)
==1826472==    by 0x17349F0F: standard_delete_finalizer<Doc2Vec> (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:29)
==1826472==    by 0x17349F0F: finalizer_wrapper<Doc2Vec, Rcpp::standard_delete_finalizer<Doc2Vec> > (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:47)
==1826472==    by 0x17349F0F: void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) (R-devel/site-library/Rcpp/include/Rcpp/XPtr.h:34)
==1826472==    by 0x52BD9C: R_RunWeakRefFinalizer (svn/R-devel/src/main/memory.c:1469)
==1826472==    by 0x52BFC8: RunFinalizers.isra.0 (svn/R-devel/src/main/memory.c:1536)
==1826472==    by 0x4DC8D4: bc_check_sigint (svn/R-devel/src/main/eval.c:5529)
==1826472==    by 0x4DC8D4: bcEval (svn/R-devel/src/main/eval.c:6723)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x532645: dispatchMethod (svn/R-devel/src/main/objects.c:436)
==1826472==    by 0x5329F2: Rf_usemethod (svn/R-devel/src/main/objects.c:486)
==1826472==    by 0x532DA4: do_usemethod (svn/R-devel/src/main/objects.c:565)
==1826472==  Block was alloc'd at
==1826472==    at 0x483AE7D: operator new(unsigned long) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:342)
==1826472==    by 0x1733B912: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:85)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472== 
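Reading the two records above: the 80-byte block allocated in Doc2Vec::train (Doc2Vec.cpp:85) is deleted at Doc2Vec.cpp:30, and on the next line (Doc2Vec.cpp:31) WMD::~WMD() still reads 48 bytes into it. That looks like a destruction-order problem. Below is a minimal sketch of the pattern and one possible ordering that avoids it. The names Table, Helper, Owner and teardown_demo are hypothetical stand-ins, not the actual doc2vec classes, and the real fix may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the doc2vec classes.
static std::size_t rows_seen_at_teardown = 0;  // records what ~Helper() read

struct Table {
    std::size_t rows;
    explicit Table(std::size_t n) : rows(n) {}
};

struct Owner;  // forward declaration

struct Helper {
    Owner* owner;                 // non-owning back pointer
    std::vector<float*> buffers;  // one scratch buffer per table row
    explicit Helper(Owner* o);
    ~Helper();                    // reads owner->table->rows
};

struct Owner {
    Table* table;    // declared first, so it is initialised before helper
    Helper* helper;
    explicit Owner(std::size_t n)
        : table(new Table(n)), helper(new Helper(this)) {}
    ~Owner() {
        // Order matters: ~Helper() dereferences `table`, so the helper
        // has to go first. Swapping these two lines gives the kind of
        // "invalid read ... inside a block ... free'd" shown above.
        delete helper;
        delete table;
    }
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;
};

Helper::Helper(Owner* o) : owner(o) {
    assert(owner != nullptr && owner->table != nullptr);
    for (std::size_t i = 0; i < owner->table->rows; ++i)
        buffers.push_back(new float[4]());
}

Helper::~Helper() {
    // Safe only while owner->table is still alive.
    rows_seen_at_teardown = owner->table->rows;
    for (std::size_t i = 0; i < owner->table->rows; ++i)
        delete[] buffers[i];
}

// Builds and destroys an Owner, then reports the row count the helper
// destructor observed.
std::size_t teardown_demo(std::size_t n) {
    { Owner model(n); }
    return rows_seen_at_teardown;
}
```

An alternative would be for the helper to keep its own copy of the row count, so its destructor never touches the owner at all.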
==1826472== Warning: set address range perms: large range [0x2ffcf028, 0x47d47458) (noaccess)
==1826472== Warning: set address range perms: large range [0x2f7cf040, 0x47547440) (undefined)
List of 3
 $ model  :<externalptr> 
 $ data   :List of 4
  ..$ file        : chr "/tmp/RtmpEAhEtr/textspace_1bdea875e9828c.txt"
  ..$ n           : num 203713
  ..$ n_vocabulary: num 4254
  ..$ n_docs      : num 999
 $ control:List of 9
  ..$ min_count: int 5
  ..$ dim      : int 15
  ..$ window   : int 5
  ..$ iter     : int 5
  ..$ lr       : num 0.05
  ..$ skipgram : logi FALSE
  ..$ hs       : int 0
  ..$ negative : int 5
  ..$ sample   : num 0.001
 - attr(*, "class")= chr "paragraph2vec_trained"
> ## End(Don't show)
> 
> 
> 
> cleanEx()

detaching ‘package:udpipe’, ‘package:tokenizers.bpe’

> nameEx("paragraph2vec_similarity")
> ### * paragraph2vec_similarity
> 
> flush(stderr()); flush(stdout())
> 
> ### Name: paragraph2vec_similarity
> ### Title: Similarity between document / word vectors as used in
> ###   paragraph2vec
> ### Aliases: paragraph2vec_similarity
> 
> ### ** Examples
> 
> x <- matrix(rnorm(6), nrow = 2, ncol = 3)
> rownames(x) <- c("word1", "word2")
> y <- matrix(rnorm(15), nrow = 5, ncol = 3)
> rownames(y) <- c("doc1", "doc2", "doc3", "doc4", "doc5")
> 
> paragraph2vec_similarity(x, y)
            doc1       doc2      doc3       doc4       doc5
word1 -0.6364508  0.3676014  1.760565 -0.5530176 -0.6067031
word2  0.7247061 -1.6298525 -4.101116  1.2512209 -0.5480451
> paragraph2vec_similarity(x, y, top_n = 1)
  term1 term2 similarity rank
1 word2  doc4   1.251221    1
2 word1  doc3   1.760565    1
> paragraph2vec_similarity(x, y, top_n = 2)
  term1 term2 similarity rank
1 word2  doc4  1.2512209    1
2 word2  doc1  0.7247061    2
3 word1  doc3  1.7605649    1
4 word1  doc2  0.3676014    2
> paragraph2vec_similarity(x, y, top_n = +Inf)
   term1 term2 similarity rank
1  word2  doc4  1.2512209    1
2  word2  doc1  0.7247061    2
3  word2  doc5 -0.5480451    3
4  word2  doc2 -1.6298525    4
5  word2  doc3 -4.1011158    5
6  word1  doc3  1.7605649    1
7  word1  doc2  0.3676014    2
8  word1  doc4 -0.5530176    3
9  word1  doc5 -0.6067031    4
10 word1  doc1 -0.6364508    5
> paragraph2vec_similarity(y, y)
           doc1       doc2      doc3        doc4       doc5
doc1  0.3898270  0.1024135 -0.596029  0.28007612 0.70449051
doc2  0.1024135  1.8218900  2.576073 -0.36378296 2.01146409
doc3 -0.5960290  2.5760733  5.910824 -2.17949696 1.72465356
doc4  0.2800761 -0.3637830 -2.179497  1.71145042 0.03355426
doc5  0.7044905  2.0114641  1.724654  0.03355426 3.13202074
> paragraph2vec_similarity(y, y, top_n = 1)
  term1 term2 similarity rank
1  doc5  doc5  3.1320207    1
2  doc4  doc4  1.7114504    1
3  doc3  doc3  5.9108240    1
4  doc2  doc3  2.5760733    1
5  doc1  doc5  0.7044905    1
> paragraph2vec_similarity(y, y, top_n = 2)
   term1 term2 similarity rank
1   doc5  doc5  3.1320207    1
2   doc5  doc2  2.0114641    2
3   doc4  doc4  1.7114504    1
4   doc4  doc1  0.2800761    2
5   doc3  doc3  5.9108240    1
6   doc3  doc2  2.5760733    2
7   doc2  doc3  2.5760733    1
8   doc2  doc5  2.0114641    2
9   doc1  doc5  0.7044905    1
10  doc1  doc1  0.3898270    2
> paragraph2vec_similarity(y, y, top_n = +Inf)
   term1 term2  similarity rank
1   doc5  doc5  3.13202074    1
2   doc5  doc2  2.01146409    2
3   doc5  doc3  1.72465356    3
4   doc5  doc1  0.70449051    4
5   doc5  doc4  0.03355426    5
6   doc4  doc4  1.71145042    1
7   doc4  doc1  0.28007612    2
8   doc4  doc5  0.03355426    3
9   doc4  doc2 -0.36378296    4
10  doc4  doc3 -2.17949696    5
11  doc3  doc3  5.91082401    1
12  doc3  doc2  2.57607334    2
13  doc3  doc5  1.72465356    3
14  doc3  doc1 -0.59602900    4
15  doc3  doc4 -2.17949696    5
16  doc2  doc3  2.57607334    1
17  doc2  doc5  2.01146409    2
18  doc2  doc2  1.82189002    3
19  doc2  doc1  0.10241352    4
20  doc2  doc4 -0.36378296    5
21  doc1  doc5  0.70449051    1
22  doc1  doc1  0.38982695    2
23  doc1  doc4  0.28007612    3
24  doc1  doc2  0.10241352    4
25  doc1  doc3 -0.59602900    5
> 
> 
> 
> cleanEx()
> nameEx("predict.paragraph2vec")
> ### * predict.paragraph2vec
> 
> flush(stderr()); flush(stdout())
> 
> ### Name: predict.paragraph2vec
> ### Title: Predict functionalities for a paragraph2vec model
> ### Aliases: predict.paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ data(belgium_parliament, package = "tokenizers.bpe")
+ x <- belgium_parliament
+ x <- subset(x, language %in% "dutch")
+ x <- subset(x, nchar(text) > 0 & txt_count(text, pattern = " ") < 1000)
+ x$doc_id <- sprintf("doc_%s", 1:nrow(x))
+ x$text   <- tolower(x$text)
+ x$text   <- gsub("[^[:alpha:]]", " ", x$text)
+ x$text   <- gsub("[[:space:]]+", " ", x$text)
+ x$text   <- trimws(x$text)
+ 
+ ## Build model
+ model <- paragraph2vec(x = x, type = "PV-DM",   dim = 15,  iter = 5)
+ 
+ sentences <- list(
+   example = c("geld", "diabetes"),
+   hi = c("geld", "diabetes", "koning"),
+   test = c("geld"),
+   nothing = character(), 
+   repr = c("geld", "diabetes", "koning"))
+   
+ ## Get embeddings (type =  'embedding')
+ predict(model, newdata = c("geld", "koning", "unknownword", NA, "</s>", ""), 
+                type = "embedding", which = "words")
+ predict(model, newdata = c("doc_1", "doc_10", "unknowndoc", NA, "</s>"), 
+                type = "embedding", which = "docs")
+ predict(model, sentences, type = "embedding")
+ 
+ ## Get most similar items (type =  'nearest')
+ predict(model, newdata = c("doc_1", "doc_10"), type = "nearest", which = "doc2doc")
+ predict(model, newdata = c("geld", "koning"), type = "nearest", which = "word2doc")
+ predict(model, newdata = c("geld", "koning"), type = "nearest", which = "word2word")
+ predict(model, newdata = sentences, type = "nearest", which = "sent2doc", top_n = 7)
+ 
+ ## Similar way on extracting similarities
+ emb <- predict(model, sentences, type = "embedding")
+ emb_docs <- as.matrix(model, type = "docs")
+ paragraph2vec_similarity(emb, emb_docs, top_n = 3)
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
==1826472== Warning: set address range perms: large range [0x2f7cf028, 0x47547458) (noaccess)
==1826472== Warning: set address range perms: large range [0x2f7cf040, 0x47547440) (undefined)
     term1   term2 similarity rank
1     test doc_285  0.9791637    1
2     test doc_807  0.9765480    2
3     test doc_195  0.9696226    3
4     repr doc_424  0.9917132    1
5     repr doc_101  0.9901011    2
6     repr doc_199  0.9894081    3
7  nothing doc_523  0.7684932    1
8  nothing doc_807  0.6819410    2
9  nothing doc_923  0.6805237    3
10      hi doc_424  0.9917132    1
11      hi doc_101  0.9901011    2
12      hi doc_199  0.9894081    3
13 example doc_424  0.9853744    1
14 example doc_199  0.9818371    2
15 example doc_790  0.9786496    3
> ## End(Don't show)
> 
> 
> 
> cleanEx()

detaching ‘package:udpipe’, ‘package:tokenizers.bpe’

> nameEx("read.paragraph2vec")
> ### * read.paragraph2vec
> 
> flush(stderr()); flush(stdout())
> 
> ### Name: read.paragraph2vec
> ### Title: Read a binary paragraph2vec model from disk
> ### Aliases: read.paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ data(belgium_parliament, package = "tokenizers.bpe")
+ x <- subset(belgium_parliament, language %in% "french")
+ x <- subset(x, nchar(text) > 0 & txt_count(text, pattern = " ") < 1000)
+ 
+ ## Don't show: 
+ model <- paragraph2vec(x = head(x, 5), 
+                        type = "PV-DM", dim = 5, iter = 1, min_count = 0)
+ ## End(Don't show)
+ path <- "mymodel.bin"
+ ## Don't show: 
+ path <- tempfile(pattern = "paragraph2vec", fileext = ".bin")
+ ## End(Don't show)
+ write.paragraph2vec(model, file = path)
+ model <- read.paragraph2vec(file = path)
+ 
+ vocab <- summary(model, type = "vocabulary", which = "docs")
+ vocab <- summary(model, type = "vocabulary", which = "words")
+ embedding <- as.matrix(model, which = "docs")
+ embedding <- as.matrix(model, which = "words")
+ ## Don't show: 
+ file.remove(path)
+ ## End(Don't show)
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
==1826472== Warning: set address range perms: large range [0x60efa040, 0x78c72440) (undefined)
==1826472== Warning: set address range perms: large range [0x87155040, 0x9eecd440) (undefined)
[1] TRUE
> ## End(Don't show)
> 
> 
> 
> cleanEx()

detaching ‘package:udpipe’, ‘package:tokenizers.bpe’

> nameEx("write.paragraph2vec")
> ### * write.paragraph2vec
> 
> flush(stderr()); flush(stdout())
> 
> ### Name: write.paragraph2vec
> ### Title: Save a paragraph2vec model to disk
> ### Aliases: write.paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ data(belgium_parliament, package = "tokenizers.bpe")
+ x <- subset(belgium_parliament, language %in% "french")
+ x <- subset(x, nchar(text) > 0 & txt_count(text, pattern = " ") < 1000)
+ 
+ ## Don't show: 
+ model <- paragraph2vec(x = head(x, 5), 
+                        type = "PV-DM", dim = 5, iter = 1, min_count = 0)
+ ## End(Don't show)
+ path <- "mymodel.bin"
+ ## Don't show: 
+ path <- tempfile(pattern = "paragraph2vec", fileext = ".bin")
+ ## End(Don't show)
+ write.paragraph2vec(model, file = path)
+ model <- read.paragraph2vec(file = path)
+ 
+ vocab <- summary(model, type = "vocabulary", which = "docs")
+ vocab <- summary(model, type = "vocabulary", which = "words")
+ embedding <- as.matrix(model, which = "docs")
+ embedding <- as.matrix(model, which = "words")
+ ## Don't show: 
+ file.remove(path)
+ ## End(Don't show)
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
==1826472== Warning: set address range perms: large range [0x87155028, 0x9eecd458) (noaccess)
==1826472== Warning: set address range perms: large range [0x60efa028, 0x78c72458) (noaccess)
==1826472== Warning: set address range perms: large range [0x2f7cf028, 0x47547458) (noaccess)
==1826472== Warning: set address range perms: large range [0x2f7cf040, 0x47547440) (undefined)
==1826472== Warning: set address range perms: large range [0x60efa040, 0x78c72440) (undefined)
[1] TRUE
> ## End(Don't show)
> 
> 
> 
> ### * <FOOTER>
> ###
> cleanEx()

detaching ‘package:udpipe’, ‘package:tokenizers.bpe’

> options(digits = 7L)
> base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n")
Time elapsed:  626.076 12.991 644.965 0 0 
> grDevices::dev.off()
null device 
          1 
> ###
> ### Local variables: ***
> ### mode: outline-minor ***
> ### outline-regexp: "\\(> \\)?### [*]+" ***
> ### End: ***
> quit('no')
==1826472== 
==1826472== HEAP SUMMARY:
==1826472==     in use at exit: 1,481,452,553 bytes in 107,402 blocks
==1826472==   total heap usage: 4,495,675 allocs, 4,388,273 frees, 5,369,422,879 bytes allocated
==1826472== 
==1826472== 9 bytes in 1 blocks are possibly lost in loss record 13 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17342FAF: Vocabulary::addWordToVocab(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:87)
==1826472==    by 0x1734353A: Vocabulary::loadFromTrainFile(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:67)
==1826472==    by 0x1734391C: Vocabulary::Vocabulary(char const*, int, bool) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:15)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 20 bytes in 4 blocks are definitely lost in loss record 25 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17342FAF: Vocabulary::addWordToVocab(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:87)
==1826472==    by 0x1734358C: Vocabulary::loadFromTrainFile(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:45)
==1826472==    by 0x1734391C: Vocabulary::Vocabulary(char const*, int, bool) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:15)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 44 bytes in 1 blocks are possibly lost in loss record 36 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343C76: Vocabulary::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:278)
==1826472==    by 0x1733B701: Doc2Vec::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:292)
==1826472==    by 0x1734525A: paragraph2vec_load_model(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:58)
==1826472==    by 0x1734DE3D: _doc2vec_paragraph2vec_load_model (packages/tests-vg/doc2vec/src/RcppExports.cpp:48)
==1826472==    by 0x49CF2F: R_doDotCall (svn/R-devel/src/main/dotcode.c:598)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 112 bytes in 2 blocks are definitely lost in loss record 62 of 2,458
==1826472==    at 0x483B582: operator new[](unsigned long) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:431)
==1826472==    by 0x17343D61: WMD::WMD(Doc2Vec*) (packages/tests-vg/doc2vec/src/doc2vec/WMD.cpp:14)
==1826472==    by 0x1733B82E: Doc2Vec::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:307)
==1826472==    by 0x1734525A: paragraph2vec_load_model(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:58)
==1826472==    by 0x1734DE3D: _doc2vec_paragraph2vec_load_model (packages/tests-vg/doc2vec/src/RcppExports.cpp:48)
==1826472==    by 0x49CF2F: R_doDotCall (svn/R-devel/src/main/dotcode.c:598)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
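The "definitely lost" record above points at an operator new[] in WMD::WMD (WMD.cpp:14) reached from Doc2Vec::load. I have not confirmed why that array is never released. As a sketch of one way to make this class of leak impossible, ownership can be handed to smart pointers so that replacing or destroying the owner frees the memory. Scratch, Loader and live_after_two_loads are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-ins for illustration; these are NOT the doc2vec classes.
static int live_scratch = 0;  // number of Scratch objects currently alive

struct Scratch {
    std::unique_ptr<float[]> weights;  // released automatically with delete[]
    explicit Scratch(std::size_t n) : weights(new float[n]()) { ++live_scratch; }
    ~Scratch() { --live_scratch; }
};

struct Loader {
    std::unique_ptr<Scratch> scratch;
    // Loading again replaces the previous Scratch; reset() deletes the old one.
    void load(std::size_t n) { scratch.reset(new Scratch(n)); }
};

// Loads twice, checks that only one Scratch survives the second load,
// and returns how many are alive after the Loader is gone.
int live_after_two_loads() {
    {
        Loader loader;
        loader.load(7);
        loader.load(7);
        assert(live_scratch == 1);
    }
    return live_scratch;
}
```

With raw pointers the same code would need a matching delete[] in the destructor and a delete before every reassignment; missing either one produces a loss record like the one above.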
==1826472== 120 bytes in 2 blocks are possibly lost in loss record 63 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17342FAF: Vocabulary::addWordToVocab(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:87)
==1826472==    by 0x17343497: Vocabulary::loadFromTrainFile(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:53)
==1826472==    by 0x1734391C: Vocabulary::Vocabulary(char const*, int, bool) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:15)
==1826472==    by 0x1733B904: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:84)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 309 bytes in 7 blocks are definitely lost in loss record 127 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343C01: Vocabulary::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:272)
==1826472==    by 0x1733B71A: Doc2Vec::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:294)
==1826472==    by 0x1734525A: paragraph2vec_load_model(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:58)
==1826472==    by 0x1734DE3D: _doc2vec_paragraph2vec_load_model (packages/tests-vg/doc2vec/src/RcppExports.cpp:48)
==1826472==    by 0x49CF2F: R_doDotCall (svn/R-devel/src/main/dotcode.c:598)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 960 bytes in 24 blocks are possibly lost in loss record 196 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343632: Vocabulary::createHuffmanTree() (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:169)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 1,144 bytes in 2 blocks are possibly lost in loss record 209 of 2,458
==1826472==    at 0x483B582: operator new[](unsigned long) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:431)
==1826472==    by 0x173406E4: UnWeightedDocument::UnWeightedDocument(Doc2Vec*, TaggedDocument*) (packages/tests-vg/doc2vec/src/doc2vec/TaggedBrownCorpus.cpp:123)
==1826472==    by 0x1734401A: WMD::loadFromDoc2Vec() (packages/tests-vg/doc2vec/src/doc2vec/WMD.cpp:67)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 5,502 bytes in 662 blocks are definitely lost in loss record 303 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343C01: Vocabulary::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:272)
==1826472==    by 0x1733B701: Doc2Vec::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:292)
==1826472==    by 0x1734525A: paragraph2vec_load_model(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:58)
==1826472==    by 0x1734DE3D: _doc2vec_paragraph2vec_load_model (packages/tests-vg/doc2vec/src/RcppExports.cpp:48)
==1826472==    by 0x49CF2F: R_doDotCall (svn/R-devel/src/main/dotcode.c:598)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 7,319 bytes in 662 blocks are definitely lost in loss record 320 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343CA8: Vocabulary::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:280)
==1826472==    by 0x1733B701: Doc2Vec::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:292)
==1826472==    by 0x1734525A: paragraph2vec_load_model(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:58)
==1826472==    by 0x1734DE3D: _doc2vec_paragraph2vec_load_model (packages/tests-vg/doc2vec/src/RcppExports.cpp:48)
==1826472==    by 0x49CF2F: R_doDotCall (svn/R-devel/src/main/dotcode.c:598)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 8,000 bytes in 1 blocks are possibly lost in loss record 1,151 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17340461: TaggedDocument::TaggedDocument() (packages/tests-vg/doc2vec/src/doc2vec/TaggedBrownCorpus.cpp:88)
==1826472==    by 0x17340B7F: TaggedBrownCorpus::TaggedBrownCorpus(char const*, long long, long long) (packages/tests-vg/doc2vec/src/doc2vec/TaggedBrownCorpus.cpp:12)
==1826472==    by 0x1733B637: Doc2Vec::initTrainModelThreads(char const*, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:123)
==1826472==    by 0x1733B99D: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:91)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 14,560 bytes in 91 blocks are possibly lost in loss record 1,199 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343645: Vocabulary::createHuffmanTree() (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:170)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 29,232 bytes in 661 blocks are definitely lost in loss record 1,679 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343C76: Vocabulary::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:278)
==1826472==    by 0x1733B701: Doc2Vec::load(_IO_FILE*) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:292)
==1826472==    by 0x1734525A: paragraph2vec_load_model(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:58)
==1826472==    by 0x1734DE3D: _doc2vec_paragraph2vec_load_model (packages/tests-vg/doc2vec/src/RcppExports.cpp:48)
==1826472==    by 0x49CF2F: R_doDotCall (svn/R-devel/src/main/dotcode.c:598)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 100,400 bytes in 1,004 blocks are possibly lost in loss record 2,144 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17340482: TaggedDocument::TaggedDocument() (packages/tests-vg/doc2vec/src/doc2vec/TaggedBrownCorpus.cpp:89)
==1826472==    by 0x17340B7F: TaggedBrownCorpus::TaggedBrownCorpus(char const*, long long, long long) (packages/tests-vg/doc2vec/src/doc2vec/TaggedBrownCorpus.cpp:12)
==1826472==    by 0x1733B637: Doc2Vec::initTrainModelThreads(char const*, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:123)
==1826472==    by 0x1733B99D: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:91)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 118,577 bytes in 13,315 blocks are definitely lost in loss record 2,180 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17342FAF: Vocabulary::addWordToVocab(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:87)
==1826472==    by 0x1734353A: Vocabulary::loadFromTrainFile(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:67)
==1826472==    by 0x1734391C: Vocabulary::Vocabulary(char const*, int, bool) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:15)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 128,591 bytes in 3,000 blocks are definitely lost in loss record 2,215 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17342FAF: Vocabulary::addWordToVocab(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:87)
==1826472==    by 0x17343497: Vocabulary::loadFromTrainFile(char const*) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:53)
==1826472==    by 0x1734391C: Vocabulary::Vocabulary(char const*, int, bool) (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:15)
==1826472==    by 0x1733B904: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:84)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472== 
==1826472== 323,824 (224 direct, 323,600 indirect) bytes in 4 blocks are definitely lost in loss record 2,365 of 2,458
==1826472==    at 0x483AE7D: operator new(unsigned long) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:342)
==1826472==    by 0x1733B621: Doc2Vec::initTrainModelThreads(char const*, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:123)
==1826472==    by 0x1733B99D: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:91)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 531,160 bytes in 13,279 blocks are definitely lost in loss record 2,403 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343632: Vocabulary::createHuffmanTree() (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:169)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== 799,848 (24,144 direct, 775,704 indirect) bytes in 1,006 blocks are definitely lost in loss record 2,415 of 2,458
==1826472==    at 0x483AE7D: operator new(unsigned long) (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:342)
==1826472==    by 0x17344008: WMD::loadFromDoc2Vec() (packages/tests-vg/doc2vec/src/doc2vec/WMD.cpp:67)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472== 
==1826472== 2,109,120 bytes in 13,182 blocks are definitely lost in loss record 2,435 of 2,458
==1826472==    at 0x483CAE9: calloc (/builddir/build/BUILD/valgrind-3.16.1/coregrind/m_replacemalloc/vg_replace_malloc.c:760)
==1826472==    by 0x17343645: Vocabulary::createHuffmanTree() (packages/tests-vg/doc2vec/src/doc2vec/Vocab.cpp:170)
==1826472==    by 0x1733B8DF: Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) (packages/tests-vg/doc2vec/src/doc2vec/Doc2Vec.cpp:83)
==1826472==    by 0x17345C02: paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) (packages/tests-vg/doc2vec/src/rcpp_doc2vec.cpp:15)
==1826472==    by 0x1734E7E6: _doc2vec_paragraph2vec_train (packages/tests-vg/doc2vec/src/RcppExports.cpp:26)
==1826472==    by 0x49CDC5: R_doDotCall (svn/R-devel/src/main/dotcode.c:645)
==1826472==    by 0x49D3E3: do_dotcall (svn/R-devel/src/main/dotcode.c:1281)
==1826472==    by 0x4D34A6: bcEval (svn/R-devel/src/main/eval.c:7115)
==1826472==    by 0x4EFFB7: Rf_eval (svn/R-devel/src/main/eval.c:727)
==1826472==    by 0x4F19CD: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==1826472==    by 0x4F26C3: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==1826472==    by 0x4DF55D: bcEval (svn/R-devel/src/main/eval.c:7083)
==1826472== 
==1826472== LEAK SUMMARY:
==1826472==    definitely lost: 2,954,310 bytes in 45,784 blocks
==1826472==    indirectly lost: 1,099,304 bytes in 4,002 blocks
==1826472==      possibly lost: 125,237 bytes in 1,126 blocks
==1826472==    still reachable: 1,477,273,702 bytes in 56,490 blocks
==1826472==         suppressed: 0 bytes in 0 blocks
==1826472== Reachable blocks (those to which a pointer was found) are not shown.
==1826472== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1826472== 
==1826472== For lists of detected and suppressed errors, rerun with: -s
==1826472== ERROR SUMMARY: 3041 errors from 22 contexts (suppressed: 0 from 0)
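
Most of the "definitely lost" records above start at calloc calls in Vocabulary::addWordToVocab (Vocab.cpp:87), Vocabulary::createHuffmanTree (Vocab.cpp:169-170) and Vocabulary::load (Vocab.cpp:272-280). That suggests the per-word buffers are not released when a vocabulary is destroyed. The following is a minimal sketch of the usual remedy, not the package's actual code: SketchWord, SketchVocabulary, counted_calloc and counted_free are hypothetical names introduced only for illustration, and the buffer length of 40 is an assumption chosen to match the 40-byte and 160-byte block sizes in the trace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for one vocabulary entry; field names are illustrative
// and are not taken from the doc2vec sources.
struct SketchWord {
    char* word;   // calloc'ed copy of the token
    int*  point;  // calloc'ed Huffman path
    char* code;   // calloc'ed Huffman code
};

// Counts live calloc'ed buffers so the sketch can check its own cleanup.
static long g_live_buffers = 0;

static void* counted_calloc(std::size_t n, std::size_t size) {
    void* p = std::calloc(n, size);
    if (p != nullptr) ++g_live_buffers;
    return p;
}

static void counted_free(void* p) {
    if (p != nullptr) {
        --g_live_buffers;
        std::free(p);
    }
}

class SketchVocabulary {
public:
    explicit SketchVocabulary(std::size_t capacity)
        : words_(static_cast<SketchWord*>(counted_calloc(capacity, sizeof(SketchWord)))),
          size_(0),
          capacity_(capacity) {
        assert(words_ != nullptr);
    }

    // Copying would lead to a double free, so it is disabled.
    SketchVocabulary(const SketchVocabulary&) = delete;
    SketchVocabulary& operator=(const SketchVocabulary&) = delete;

    // Release every per-word buffer, then the array that holds the entries.
    ~SketchVocabulary() {
        for (std::size_t i = 0; i < size_; ++i) {
            counted_free(words_[i].word);
            counted_free(words_[i].point);
            counted_free(words_[i].code);
        }
        counted_free(words_);
    }

    // Adds a token and allocates its (zero-filled) Huffman buffers.
    void add(const char* token, std::size_t code_len) {
        assert(size_ < capacity_);
        SketchWord& w = words_[size_++];
        const std::size_t len = std::strlen(token) + 1;
        w.word = static_cast<char*>(counted_calloc(len, sizeof(char)));
        assert(w.word != nullptr);
        std::memcpy(w.word, token, len);
        w.point = static_cast<int*>(counted_calloc(code_len, sizeof(int)));
        w.code = static_cast<char*>(counted_calloc(code_len, sizeof(char)));
    }

private:
    SketchWord* words_;
    std::size_t size_;
    std::size_t capacity_;
};

// Builds and destroys a vocabulary, returning how many buffers are still live.
long live_buffers_after_roundtrip() {
    {
        SketchVocabulary vocab(3);
        vocab.add("le", 40);
        vocab.add("parlement", 40);
    }  // destructor runs here
    return g_live_buffers;
}
```

If the destructor loop is removed, live_buffers_after_roundtrip() reports the buffers that were never freed, which is the pattern Memcheck flags as "definitely lost". The operator new records in Doc2Vec::initTrainModelThreads and WMD::loadFromDoc2Vec would need the same treatment with matching delete calls.
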

jwijffels commented 3 years ago

gcc ASAN

* using log directory ‘/data/gannet/ripley/R/packages/tests-gcc-SAN/doc2vec.Rcheck’
* using R Under development (unstable) (2020-12-10 r79606)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using option ‘--no-stop-on-test-error’
* checking for file ‘doc2vec/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘doc2vec’ version ‘0.1.0’
* package encoding: UTF-8
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package ‘doc2vec’ can be installed ... [435s/448s] OK
* checking package directory ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking compiled code ... OK
* checking examples ... [15s/16s] ERROR
Running examples in ‘doc2vec-Ex.R’ failed
The error most likely occurred in:

> ### Name: paragraph2vec
> ### Title: Train a paragraph2vec also known as doc2vec model on text
> ### Aliases: paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ ## Take data and standardise it a bit
+ data(belgium_parliament, package = "tokenizers.bpe")
+ str(belgium_parliament)
+ x <- subset(belgium_parliament, language %in% "french")
+ x$text   <- tolower(x$text)
+ x$text   <- gsub("[^[:alpha:]]", " ", x$text)
+ x$text   <- gsub("[[:space:]]+", " ", x$text)
+ x$text   <- trimws(x$text)
+ x$nwords <- txt_count(x$text, pattern = " ")
+ x <- subset(x, nwords < 1000 & nchar(text) > 0)
+ 
+ ## Build the model
+ model <- paragraph2vec(x = x, type = "PV-DM",   dim = 15,  iter = 5)
+ str(model)
+ embedding <- as.matrix(model, which = "words")
+ embedding <- as.matrix(model, which = "docs")
+ head(embedding)
+ 
+ ## Get vocabulary
+ vocab <- summary(model, type = "vocabulary",  which = "docs")
+ vocab <- summary(model, type = "vocabulary",  which = "words")
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
'data.frame':   2000 obs. of  3 variables:
 $ doc_id  : chr  "http://data.dekamer.be/v0/qrva/54-B144-14-1021-2017201819553" "http://data.dekamer.be/v0/qrva/54-B141-4-1075-2017201820260" "http://data.dekamer.be/v0/qrva/54-B143-4-1074-2017201820256" "http://data.dekamer.be/v0/qrva/54-B143-4-1076-2017201820265" ...
 $ text    : chr  "Percentage vrouwen met een eenoudergezin. \n\n In Wallonie werden de eenoudergezinnen onlangs gescreend. Daarui"| __truncated__ "Bescherming van de gegevens van kinderen. \n\n Op 25 mei 2018 zal de Algemene Verordening Gegevensbescherming ("| __truncated__ "Snel breedbandinternet. \n\n In het kader van het Plan voor ultrasnel internet in Belgie hebt u in 2015 uw voor"| __truncated__ "Rapport van UNICEF. - 'Danger in the air'. \n\n UNICEF heeft recent een nieuw rapport over de impact van luchtv"| __truncated__ ...
 $ language: Factor w/ 2 levels "dutch","french": 1 1 1 1 1 1 1 1 1 1 ...
=================================================================
==3097149==ERROR: AddressSanitizer: heap-use-after-free on address 0x6070000066b0 at pc 0x7f40d1fdec29 bp 0x7ffcd7cf89e0 sp 0x7ffcd7cf89d0
READ of size 8 at 0x6070000066b0 thread T0
    #0 0x7f40d1fdec28 in WMD::~WMD() doc2vec/WMD.cpp:27
    #1 0x7f40d1f60bf5 in Doc2Vec::~Doc2Vec() doc2vec/Doc2Vec.cpp:31
    #2 0x7f40d200bc55 in void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:30
    #3 0x7f40d200bc55 in void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:29
    #4 0x7f40d200bc55 in void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:47
    #5 0x7f40d200bc55 in void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:34
    #6 0x7122e9 in R_RunWeakRefFinalizer /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1469
    #7 0x7129da in RunFinalizers /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1536
    #8 0x63cf13 in bc_check_sigint /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:5529
    #9 0x63cf13 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:6723
    #10 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #11 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #12 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #13 0x72047e in dispatchMethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:436
    #14 0x7211b4 in Rf_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:486
    #15 0x721ab3 in do_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:565
    #16 0x62370c in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135
    #17 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #18 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #19 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #20 0x646bbe in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083
    #21 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #22 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #23 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #24 0x670a3f in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:850
    #25 0x4f48cc in do_docall /data/gannet/ripley/R/svn/R-devel/src/main/coerce.c:2715
    #26 0x62845b in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7115
    #27 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #28 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #29 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #30 0x646bbe in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083
    #31 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #32 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #33 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #34 0x646bbe in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083
    #35 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #36 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #37 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #38 0x724343 in do_nextmethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:915
    #39 0x71aac9 in do_internal /data/gannet/ripley/R/svn/R-devel/src/main/names.c:1397
    #40 0x62370c in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135
    #41 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #42 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #43 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #44 0x646bbe in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083
    #45 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #46 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #47 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #48 0x72047e in dispatchMethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:436
    #49 0x720f7f in Rf_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:476
    #50 0x721ab3 in do_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:565
    #51 0x62370c in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135
    #52 0x670177 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727
    #53 0x675464 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1897
    #54 0x677907 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823
    #55 0x670a3f in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:850
    #56 0x678e56 in do_begin /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2517
    #57 0x670e68 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802
    #58 0x670e68 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802
    #59 0x6efb2d in Rf_ReplIteration /data/gannet/ripley/R/svn/R-devel/src/main/main.c:264
    #60 0x6f0178 in R_ReplConsole /data/gannet/ripley/R/svn/R-devel/src/main/main.c:314
    #61 0x6f02c4 in run_Rmainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1113
    #62 0x6f0312 in Rf_mainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1120
    #63 0x41b3f8 in main /data/gannet/ripley/R/svn/R-devel/src/main/Rmain.c:29
    #64 0x7f40e2a46041 in __libc_start_main (/lib64/libc.so.6+0x27041)
    #65 0x41db6d in _start (/data/gannet/ripley/R/gcc-SAN/bin/exec/R+0x41db6d)

0x6070000066b0 is located 48 bytes inside of 80-byte region [0x607000006680,0x6070000066d0)
freed by thread T0 here:
    #0 0x7f40e413fb87 in operator delete(void*) (/lib64/libasan.so.6+0xb2b87)
    #1 0x7f40d1f60b8b in Doc2Vec::~Doc2Vec() doc2vec/Doc2Vec.cpp:30
    #2 0x7f40d200bc55 in void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:30
    #3 0x7f40d200bc55 in void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:29
    #4 0x7f40d200bc55 in void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:47
    #5 0x7f40d200bc55 in void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) /data/gannet/ripley/R/test-4.1/Rcpp/include/Rcpp/XPtr.h:34
    #6 0x7122e9 in R_RunWeakRefFinalizer /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1469

previously allocated by thread T0 here:
    #0 0x7f40e413f067 in operator new(unsigned long) (/lib64/libasan.so.6+0xb2067)
    #1 0x7f40d1f68759 in Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) doc2vec/Doc2Vec.cpp:85
    #2 0x7f40d1fe9b1a in paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) /data/gannet/ripley/R/packages/tests-gcc-SAN/doc2vec/src/rcpp_doc2vec.cpp:15
    #3 0x7f40d203ef11 in _doc2vec_paragraph2vec_train /data/gannet/ripley/R/packages/tests-gcc-SAN/doc2vec/src/RcppExports.cpp:26
    #4 0x57ca09 in R_doDotCall /data/gannet/ripley/R/svn/R-devel/src/main/dotcode.c:645

SUMMARY: AddressSanitizer: heap-use-after-free doc2vec/WMD.cpp:27 in WMD::~WMD()
Shadow bytes around the buggy address:
  0x0c0e7fff8c80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff8c90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff8ca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff8cb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff8cc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c0e7fff8cd0: fd fd fd fd fd fd[fd]fd fd fd fa fa fa fa 00 00
  0x0c0e7fff8ce0: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00
  0x0c0e7fff8cf0: 00 00 00 00 00 fa fa fa fa fa 00 00 00 00 00 00
  0x0c0e7fff8d00: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c0e7fff8d10: 00 00 fa fa fa fa 00 00 00 00 00 00 00 00 00 fa
  0x0c0e7fff8d20: fa fa fa fa 00 00 00 00 00 00 00 00 00 fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==3097149==ABORTING
* DONE
Status: 1 ERROR
jwijffels commented 3 years ago

clang-ASAN

* using log directory ‘/data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec.Rcheck’
* using R Under development (unstable) (2020-12-10 r79607)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using option ‘--no-stop-on-test-error’
* checking for file ‘doc2vec/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘doc2vec’ version ‘0.1.0’
* package encoding: UTF-8
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package ‘doc2vec’ can be installed ... [289s/341s] OK
* checking package directory ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking compiled code ... OK
* checking examples ... [18s/21s] ERROR
Running examples in ‘doc2vec-Ex.R’ failed
The error most likely occurred in:

> ### Name: paragraph2vec
> ### Title: Train a paragraph2vec also known as doc2vec model on text
> ### Aliases: paragraph2vec
> 
> ### ** Examples
> 
> ## Don't show: 
> if(require(tokenizers.bpe) & require(udpipe)){
+ ## End(Don't show)
+ library(tokenizers.bpe)
+ library(udpipe)
+ ## Take data and standardise it a bit
+ data(belgium_parliament, package = "tokenizers.bpe")
+ str(belgium_parliament)
+ x <- subset(belgium_parliament, language %in% "french")
+ x$text   <- tolower(x$text)
+ x$text   <- gsub("[^[:alpha:]]", " ", x$text)
+ x$text   <- gsub("[[:space:]]+", " ", x$text)
+ x$text   <- trimws(x$text)
+ x$nwords <- txt_count(x$text, pattern = " ")
+ x <- subset(x, nwords < 1000 & nchar(text) > 0)
+ 
+ ## Build the model
+ model <- paragraph2vec(x = x, type = "PV-DM",   dim = 15,  iter = 5)
+ str(model)
+ embedding <- as.matrix(model, which = "words")
+ embedding <- as.matrix(model, which = "docs")
+ head(embedding)
+ 
+ ## Get vocabulary
+ vocab <- summary(model, type = "vocabulary",  which = "docs")
+ vocab <- summary(model, type = "vocabulary",  which = "words")
+ ## Don't show: 
+ } # End of main if statement running only if the required packages are installed
Loading required package: tokenizers.bpe
Loading required package: udpipe
'data.frame':   2000 obs. of  3 variables:
 $ doc_id  : chr  "http://data.dekamer.be/v0/qrva/54-B144-14-1021-2017201819553" "http://data.dekamer.be/v0/qrva/54-B141-4-1075-2017201820260" "http://data.dekamer.be/v0/qrva/54-B143-4-1074-2017201820256" "http://data.dekamer.be/v0/qrva/54-B143-4-1076-2017201820265" ...
 $ text    : chr  "Percentage vrouwen met een eenoudergezin. \n\n In Wallonie werden de eenoudergezinnen onlangs gescreend. Daarui"| __truncated__ "Bescherming van de gegevens van kinderen. \n\n Op 25 mei 2018 zal de Algemene Verordening Gegevensbescherming ("| __truncated__ "Snel breedbandinternet. \n\n In het kader van het Plan voor ultrasnel internet in Belgie hebt u in 2015 uw voor"| __truncated__ "Rapport van UNICEF. - 'Danger in the air'. \n\n UNICEF heeft recent een nieuw rapport over de impact van luchtv"| __truncated__ ...
 $ language: Factor w/ 2 levels "dutch","french": 1 1 1 1 1 1 1 1 1 1 ...
=================================================================
==109612==ERROR: AddressSanitizer: heap-use-after-free on address 0x60700000c060 at pc 0x7f13354accd5 bp 0x7fffd83deb60 sp 0x7fffd83deb58
READ of size 8 at 0x60700000c060 thread T0
    #0 0x7f13354accd4 in WMD::~WMD() /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/doc2vec/WMD.cpp:27:48
    #1 0x7f13354690a6 in Doc2Vec::~Doc2Vec() /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/doc2vec/Doc2Vec.cpp:31:13
    #2 0x7f13354be4db in void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*) /data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/XPtr.h:30:5
    #3 0x7f13354be4db in void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) /data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/XPtr.h:47:5
    #4 0x9726ce in R_RunWeakRefFinalizer /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1469:2
    #5 0x974454 in RunFinalizers /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1536:3
    #6 0x834ed9 in bc_check_sigint /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:5529:5
    #7 0x834ed9 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:6723:4
    #8 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #9 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #10 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #11 0x9b7284 in dispatchMethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:436:16
    #12 0x9b5d8d in Rf_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:486:9
    #13 0x9b83e3 in do_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:565:9
    #14 0x849a27 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135:15
    #15 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #16 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #17 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #18 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #19 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #20 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #21 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #22 0x82be62 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:850:12
    #23 0x6185af in do_docall /data/gannet/ripley/R/svn/R-devel/src/main/coerce.c:2715:12
    #24 0x84c7c5 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7115:14
    #25 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #26 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #27 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #28 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #29 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #30 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #31 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #32 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #33 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #34 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #35 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #36 0x9bc0e5 in do_nextmethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:915:11
    #37 0x9adb80 in do_internal /data/gannet/ripley/R/svn/R-devel/src/main/names.c:1397:11
    #38 0x849a27 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135:15
    #39 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #40 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #41 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #42 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #43 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #44 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #45 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #46 0x9b7284 in dispatchMethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:436:16
    #47 0x9b60bf in Rf_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:476:10
    #48 0x9b83e3 in do_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:565:9
    #49 0x849a27 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135:15
    #50 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #51 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #52 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #53 0x82be62 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:850:12
    #54 0x8a1172 in do_begin /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2517:10
    #55 0x82b67c in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802:12
    #56 0x82b67c in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802:12
    #57 0x961f97 in Rf_ReplIteration /data/gannet/ripley/R/svn/R-devel/src/main/main.c:264:2
    #58 0x965370 in R_ReplConsole /data/gannet/ripley/R/svn/R-devel/src/main/main.c:314:11
    #59 0x965199 in run_Rmainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1113:5
    #60 0x9654f2 in Rf_mainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1120:5
    #61 0x4de25a in main /data/gannet/ripley/R/svn/R-devel/src/main/Rmain.c:29:5
    #62 0x7f1346219041 in __libc_start_main (/lib64/libc.so.6+0x27041)
    #63 0x43130d in _start (/data/gannet/ripley/R/R-clang-SAN/bin/exec/R+0x43130d)

0x60700000c060 is located 48 bytes inside of 80-byte region [0x60700000c030,0x60700000c080)
freed by thread T0 here:
    #0 0x4dc1ad in operator delete(void*) /data/gannet/ripley/Sources2/LLVM/11.0.0/llvm-project-11.0.0/compiler-rt/lib/asan/asan_new_delete.cpp:160:3
    #1 0x7f133546906b in Doc2Vec::~Doc2Vec() /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/doc2vec/Doc2Vec.cpp:30:12
    #2 0x7f13354be4db in void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*) /data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/XPtr.h:30:5
    #3 0x7f13354be4db in void Rcpp::finalizer_wrapper<Doc2Vec, &(void Rcpp::standard_delete_finalizer<Doc2Vec>(Doc2Vec*))>(SEXPREC*) /data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/XPtr.h:47:5
    #4 0x9726ce in R_RunWeakRefFinalizer /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1469:2
    #5 0x974454 in RunFinalizers /data/gannet/ripley/R/svn/R-devel/src/main/memory.c:1536:3
    #6 0x834ed9 in bc_check_sigint /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:5529:5
    #7 0x834ed9 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:6723:4
    #8 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #9 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #10 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #11 0x9b7284 in dispatchMethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:436:16
    #12 0x9b5d8d in Rf_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:486:9
    #13 0x9b83e3 in do_usemethod /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:565:9
    #14 0x849a27 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7135:15
    #15 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #16 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #17 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #18 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #19 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #20 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #21 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #22 0x82be62 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:850:12
    #23 0x6185af in do_docall /data/gannet/ripley/R/svn/R-devel/src/main/coerce.c:2715:12
    #24 0x84c7c5 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7115:14
    #25 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #26 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #27 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #28 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #29 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #30 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #31 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16

previously allocated by thread T0 here:
    #0 0x4db94d in operator new(unsigned long) /data/gannet/ripley/Sources2/LLVM/11.0.0/llvm-project-11.0.0/compiler-rt/lib/asan/asan_new_delete.cpp:99:3
    #1 0x7f1335469e08 in Doc2Vec::train(char const*, int, int, int, int, int, int, float, float, int, int, int) /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/doc2vec/Doc2Vec.cpp:85:10
    #2 0x7f13354b098a in paragraph2vec_train(char const*, int, int, int, int, int, int, double, double, int, int, int) /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/rcpp_doc2vec.cpp:15:10
    #3 0x7f13354daba9 in _doc2vec_paragraph2vec_train /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/RcppExports.cpp:26:34
    #4 0x6dfed8 in R_doDotCall /data/gannet/ripley/R/svn/R-devel/src/main/dotcode.c:645:17
    #5 0x72ed31 in do_dotcall /data/gannet/ripley/R/svn/R-devel/src/main/dotcode.c:1281:11
    #6 0x84c7c5 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7115:14
    #7 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #8 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #9 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #10 0x84f397 in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7083:12
    #11 0x82b029 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:727:8
    #12 0x895750 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
    #13 0x890977 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1823:16
    #14 0x82be62 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:850:12
    #15 0x8a21b9 in do_set /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2969:8
    #16 0x82b67c in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802:12
    #17 0x8a1172 in do_begin /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2517:10
    #18 0x82b67c in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802:12
    #19 0x82b67c in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:802:12
    #20 0x961f97 in Rf_ReplIteration /data/gannet/ripley/R/svn/R-devel/src/main/main.c:264:2
    #21 0x965370 in R_ReplConsole /data/gannet/ripley/R/svn/R-devel/src/main/main.c:314:11
    #22 0x965199 in run_Rmainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1113:5
    #23 0x9654f2 in Rf_mainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1120:5
    #24 0x4de25a in main /data/gannet/ripley/R/svn/R-devel/src/main/Rmain.c:29:5
    #25 0x7f1346219041 in __libc_start_main (/lib64/libc.so.6+0x27041)

SUMMARY: AddressSanitizer: heap-use-after-free /data/gannet/ripley/R/packages/tests-clang-SAN/doc2vec/src/doc2vec/WMD.cpp:27:48 in WMD::~WMD()
Shadow bytes around the buggy address:
  0x0c0e7fff97b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff97c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff97d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff97e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0e7fff97f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c0e7fff9800: fa fa fa fa fa fa fd fd fd fd fd fd[fd]fd fd fd
  0x0c0e7fff9810: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 fa fa
  0x0c0e7fff9820: fa fa 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c0e7fff9830: 00 00 00 00 00 00 00 00 00 fa fa fa fa fa 00 00
  0x0c0e7fff9840: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00
  0x0c0e7fff9850: 00 00 00 00 00 fa fa fa fa fa 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==109612==ABORTING
* DONE
Status: 1 ERROR
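
Reading the gcc and clang reports together: the 80-byte region allocated in Doc2Vec::train (Doc2Vec.cpp:85) is deleted in Doc2Vec::~Doc2Vec() at Doc2Vec.cpp:30, and the next statement (Doc2Vec.cpp:31) runs WMD::~WMD(), which reads 8 bytes from that freed region at WMD.cpp:27. That looks like a destruction-order problem: a helper whose destructor dereferences an object its owner has already deleted. Below is a minimal sketch of one way to make that pattern safe. It has not been checked against the actual Doc2Vec/WMD sources; `Vocabulary`, `DistanceHelper`, `Model` and `destroy_model_and_collect_log` are hypothetical stand-ins, and swapping raw `new`/`delete` for smart pointers is an assumption about what would fit the package, not the fix that was applied.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the object that is freed too early in the reports.
struct Vocabulary {
    std::vector<std::string> words;
};

// Hypothetical helper that, like WMD::~WMD() in the traces, touches the
// vocabulary from inside its destructor.
class DistanceHelper {
public:
    DistanceHelper(std::shared_ptr<Vocabulary> vocab, std::vector<std::string>* log)
        : vocab_(std::move(vocab)), log_(log) {}

    ~DistanceHelper() {
        // Safe: the helper co-owns the vocabulary, so it is still alive here
        // regardless of the order in which the owner releases its members.
        log_->push_back("helper saw " + std::to_string(vocab_->words.size()) + " words");
    }

private:
    std::shared_ptr<Vocabulary> vocab_;
    std::vector<std::string>* log_;  // not owned, must outlive the helper
};

// Hypothetical owner playing the role of the model object.
class Model {
public:
    explicit Model(std::vector<std::string>* log)
        : vocab_(std::make_shared<Vocabulary>()),
          helper_(std::make_unique<DistanceHelper>(vocab_, log)) {
        vocab_->words = {"le", "la", "les"};
    }
    // No hand-written destructor: members are destroyed in reverse declaration
    // order, so helper_ goes first and vocab_ is released afterwards.

private:
    std::shared_ptr<Vocabulary> vocab_;
    std::unique_ptr<DistanceHelper> helper_;
};

// Builds and destroys a Model, returning what the helper logged while dying.
std::vector<std::string> destroy_model_and_collect_log() {
    std::vector<std::string> log;
    {
        Model m(&log);
    }  // m is destroyed here
    return log;
}
```

With raw pointers the equivalent minimal change would be to delete the helper before the object it reads from, or to stop the helper's destructor from touching memory it does not own.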
jwijffels commented 3 years ago

Not reproducible on R-hub.

jwijffels commented 3 years ago

But I can reproduce it on a local Ubuntu server using valgrind.

jwijffels commented 3 years ago

Got a crash today:

Error in `/usr/lib/R/bin/exec/R': free(): invalid next size (fast): 0x000000025bb88e10
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7f2727fe37f5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8038a)[0x7f2727fec38a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f2727ff058c]
/data/home/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so(_ZN14TaggedDocumentD1Ev+0x29)[0x7f2709e8dae9]
/data/home/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so(_Z19paragraph2vec_inferP7SEXPRECN4Rcpp6VectorILi19ENS1_15PreserveStorageEEE+0x3e5)[0x7f2709e97785]
/data/home/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so(_doc2vec_paragraph2vec_infer+0x6b)[0x7f2709e9f36b]
/usr/lib/R/lib/libR.so(+0xd2c9c)[0x7f2728625c9c]
/usr/lib/R/lib/libR.so(Rf_eval+0x7bd)[0x7f272866320d]
/usr/lib/R/lib/libR.so(+0x112cae)[0x7f2728665cae]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(+0x11200f)[0x7f272866500f]
/usr/lib/R/lib/libR.so(Rf_eval+0x366)[0x7f2728662db6]
/usr/lib/R/lib/libR.so(+0x113e16)[0x7f2728666e16]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(+0x112cae)[0x7f2728665cae]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(+0x112cae)[0x7f2728665cae]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(+0x112cae)[0x7f2728665cae]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(+0x11200f)[0x7f272866500f]
/usr/lib/R/lib/libR.so(+0x1470a8)[0x7f272869a0a8]
/usr/lib/R/lib/libR.so(+0x1474bb)[0x7f272869a4bb]
/usr/lib/R/lib/libR.so(+0x1477b8)[0x7f272869a7b8]
/usr/lib/R/lib/libR.so(+0x103916)[0x7f2728656916]
/usr/lib/R/lib/libR.so(Rf_eval+0x198)[0x7f2728662be8]
/usr/lib/R/lib/libR.so(+0x11200f)[0x7f272866500f]
/usr/lib/R/lib/libR.so(+0x107520)[0x7f272865a520]
/usr/lib/R/lib/libR.so(Rf_eval+0x198)[0x7f2728662be8]
/usr/lib/R/lib/libR.so(+0x11200f)[0x7f272866500f]
/usr/lib/R/lib/libR.so(Rf_eval+0x366)[0x7f2728662db6]
/usr/lib/R/lib/libR.so(+0x1130f6)[0x7f27286660f6]
/usr/lib/R/lib/libR.so(Rf_eval+0x59c)[0x7f2728662fec]
/usr/lib/R/lib/libR.so(Rf_ReplIteration+0x222)[0x7f272868c632]
/usr/lib/R/lib/libR.so(+0x139a31)[0x7f272868ca31]
/usr/lib/R/lib/libR.so(run_Rmainloop+0x48)[0x7f272868cae8]
/usr/lib/R/bin/exec/R(main+0x1b)[0x4007cb]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f2727f8c840]
/usr/lib/R/bin/exec/R(_start+0x29)[0x400809]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:01 1280783 /usr/lib/R/bin/exec/R
00600000-00601000 r--p 00000000 08:01 1280783 /usr/lib/R/bin/exec/R
00601000-00602000 rw-p 00001000 08:01 1280783 /usr/lib/R/bin/exec/R
0100a000-2e6393000 rw-p 00000000 00:00 0 [heap]
7f261415f000-7f2624ee4000 rw-p 00000000 00:00 0
7f2698699000-7f26d1e5d000 rw-p 00000000 00:00 0
7f26d8000000-7f26d8021000 rw-p 00000000 00:00 0
7f26d8021000-7f26dc000000 ---p 00000000 00:00 0
7f26dd482000-7f26fb8e7000 rw-p 00000000 00:00 0
7f26fe0e8000-7f26fe0e9000 ---p 00000000 00:00 0
7f26fe0e9000-7f26ff4b0000 rw-p 00000000 00:00 0
7f26ff4b0000-7f26ff4b1000 ---p 00000000 00:00 0
7f26ff4b1000-7f2700167000 rw-p 00000000 00:00 0
7f2700568000-7f2700a1e000 rw-p 00000000 00:00 0
7f2700c1f000-7f27052cd000 rw-p 00000000 00:00 0
7f27052cd000-7f270547e000 r-xp 00000000 08:01 2726 /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.3
7f270547e000-7f270567d000 ---p 001b1000 08:01 2726 /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.3
7f270567d000-7f2705685000 r--p 001b0000 08:01 2726 /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.3
7f2705685000-7f2705687000 rw-p 001b8000 08:01 2726 /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.3
7f2705687000-7f2705688000 rw-p 00000000 00:00 0
7f2705688000-7f27056d2000 r-xp 00000000 08:01 3074865 /usr/local/lib/R/site-library/xml2/libs/xml2.so
7f27056d2000-7f27058d2000 ---p 0004a000 08:01 3074865 /usr/lo cal/lib/R/site-library/xml2/libs/xml2.so 7f27058d2000-7f27058d3000 r--p 0004a000 08:01 3074865 /usr/lo cal/lib/R/site-library/xml2/libs/xml2.so 7f27058d3000-7f27058d4000 rw-p 0004b000 08:01 3074865 /usr/lo cal/lib/R/site-library/xml2/libs/xml2.so 7f27058d4000-7f27058d6000 rw-p 00000000 00:00 0 7f27058d6000-7f270591d000 r-xp 00000000 08:21 22152185 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/word2vec/libs/word2vec.so 7f270591d000-7f2705b1c000 ---p 00047000 08:21 22152185 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/word2vec/libs/word2vec.so 7f2705b1c000-7f2705b1e000 r--p 00046000 08:21 22152185 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/word2vec/libs/word2vec.so 7f2705b1e000-7f2705b1f000 rw-p 00048000 08:21 22152185 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/word2vec/libs/word2vec.so 7f2705b1f000-7f2705b20000 rw-p 00000000 00:00 0 7f2705b20000-7f2705b23000 r-xp 00000000 08:01 3072561 /usr/li b/R/site-library/uuid/libs/uuid.so 7f2705b23000-7f2705d22000 ---p 00003000 08:01 3072561 /usr/li b/R/site-library/uuid/libs/uuid.so 7f2705d22000-7f2705d23000 r--p 00002000 08:01 3072561 /usr/li b/R/site-library/uuid/libs/uuid.so 7f2705d23000-7f2705d24000 rw-p 00003000 08:01 3072561 /usr/li b/R/site-library/uuid/libs/uuid.so 7f2705d24000-7f2705d66000 r-xp 00000000 08:21 22154071 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/urltools/libs/urltools.so 7f2705d66000-7f2705f66000 ---p 00042000 08:21 22154071 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/urltools/libs/urltools.so 7f2705f66000-7f2705f67000 r--p 00042000 08:21 22154071 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/urltools/libs/urltools.so 7f2705f67000-7f2705f68000 rw-p 00043000 08:21 22154071 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/urltools/libs/urltools.so 7f2705f68000-7f2705f6c000 rw-p 00000000 00:00 0 7f2705f6c000-7f2705fb2000 r-xp 00000000 08:21 22154028 /data/h 
ome/nba/R/x86_64-pc-linux-gnu-library/3.4/triebeard/libs/triebeard.so 7f2705fb2000-7f27061b1000 ---p 00046000 08:21 22154028 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/triebeard/libs/triebeard.so 7f27061b1000-7f27061b2000 r--p 00045000 08:21 22154028 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/triebeard/libs/triebeard.so 7f27061b2000-7f27061b3000 rw-p 00046000 08:21 22154028 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/triebeard/libs/triebeard.so 7f27061b3000-7f27061b4000 rw-p 00000000 00:00 0 7f27061b4000-7f27061d2000 r-xp 00000000 08:21 22152150 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/text.alignment/libs/text.alignment.so 7f27061d2000-7f27063d2000 ---p 0001e000 08:21 22152150 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/text.alignment/libs/text.alignment.so 7f27063d2000-7f27063d3000 r--p 0001e000 08:21 22152150 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/text.alignment/libs/text.alignment.so 7f27063d3000-7f27063d4000 rw-p 0001f000 08:21 22152150 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/text.alignment/libs/text.alignment.so 7f27063d4000-7f2707c8a000 r-xp 00000000 08:01 30388 /usr/li b/x86_64-linux-gnu/libicudata.so.55.1 7f2707c8a000-7f2707e89000 ---p 018b6000 08:01 30388 /usr/li b/x86_64-linux-gnu/libicudata.so.55.1 7f2707e89000-7f2707e8a000 r--p 018b5000 08:01 30388 /usr/li b/x86_64-linux-gnu/libicudata.so.55.1 7f2707e8a000-7f2707e8b000 rw-p 018b6000 08:01 30388 /usr/li b/x86_64-linux-gnu/libicudata.so.55.1 7f2707e8b000-7f270800a000 r-xp 00000000 08:01 30376 /usr/li b/x86_64-linux-gnu/libicuuc.so.55.1 7f270800a000-7f270820a000 ---p 0017f000 08:01 30376 /usr/li b/x86_64-linux-gnu/libicuuc.so.55.1 7f270820a000-7f270821a000 r--p 0017f000 08:01 30376 /usr/li b/x86_64-linux-gnu/libicuuc.so.55.1 7f270821a000-7f270821b000 rw-p 0018f000 08:01 30376 /usr/li b/x86_64-linux-gnu/libicuuc.so.55.1 7f270821b000-7f270821f000 rw-p 00000000 00:00 0 7f270821f000-7f2708471000 r-xp 00000000 08:01 30384 /usr/li 
b/x86_64-linux-gnu/libicui18n.so.55.1 7f2708471000-7f2708671000 ---p 00252000 08:01 30384 /usr/li b/x86_64-linux-gnu/libicui18n.so.55.1 7f2708671000-7f2708680000 r--p 00252000 08:01 30384 /usr/li b/x86_64-linux-gnu/libicui18n.so.55.1 7f2708680000-7f2708681000 rw-p 00261000 08:01 30384 /usr/li b/x86_64-linux-gnu/libicui18n.so.55.1 7f2708681000-7f27086ee000 r-xp 00000000 08:01 3074316 /usr/lo cal/lib/R/site-library/stringi/libs/stringi.so 7f27086ee000-7f27088ee000 ---p 0006d000 08:01 3074316 /usr/lo cal/lib/R/site-library/stringi/libs/stringi.so 7f27088ee000-7f27088f0000 r--p 0006d000 08:01 3074316 /usr/lo cal/lib/R/site-library/stringi/libs/stringi.so 7f27088f0000-7f27088f1000 rw-p 0006f000 08:01 3074316 /usr/lo cal/lib/R/site-library/stringi/libs/stringi.so 7f27088f1000-7f2708935000 r-xp 00000000 08:01 3072413 /usr/lo cal/lib/R/site-library/glmnet/libs/glmnet.so 7f2708935000-7f2708b34000 ---p 00044000 08:01 3072413 /usr/lo cal/lib/R/site-library/glmnet/libs/glmnet.so 7f2708b34000-7f2708b35000 r--p 00043000 08:01 3072413 /usr/lo cal/lib/R/site-library/glmnet/libs/glmnet.so 7f2708b35000-7f2708b36000 rw-p 00044000 08:01 3072413 /usr/lo cal/lib/R/site-library/glmnet/libs/glmnet.so 7f2708b36000-7f2708d06000 r-xp 00000000 08:21 22283829 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/cld2/libs/cld2.so 7f2708d06000-7f2708f05000 ---p 001d0000 08:21 22283829 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/cld2/libs/cld2.so 7f2708f05000-7f2708f12000 r--p 001cf000 08:21 22283829 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/cld2/libs/cld2.so 7f2708f12000-7f2708f13000 rw-p 001dc000 08:21 22283829 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/cld2/libs/cld2.so 7f2708f13000-7f2708f2e000 r-xp 00000000 08:01 25871 /usr/li b/x86_64-linux-gnu/librtmp.so.1 7f2708f2e000-7f270912d000 ---p 0001b000 08:01 25871 /usr/li b/x86_64-linux-gnu/librtmp.so.1 7f270912d000-7f270912e000 r--p 0001a000 08:01 25871 /usr/li b/x86_64-linux-gnu/librtmp.so.1 7f270912e000-7f270912f000 rw-p 
0001b000 08:01 25871 /usr/li b/x86_64-linux-gnu/librtmp.so.1 7f270912f000-7f270919b000 r-xp 00000000 08:01 21184 /usr/li b/x86_64-linux-gnu/libcurl.so.4.4.0 7f270919b000-7f270939a000 ---p 0006c000 08:01 21184 /usr/li b/x86_64-linux-gnu/libcurl.so.4.4.0 7f270939a000-7f270939d000 r--p 0006b000 08:01 21184 /usr/li b/x86_64-linux-gnu/libcurl.so.4.4.0 7f270939d000-7f270939e000 rw-p 0006e000 08:01 21184 /usr/li b/x86_64-linux-gnu/libcurl.so.4.4.0 7f270939e000-7f27093a9000 r-xp 00000000 08:21 28444170 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/RCurl/libs/RCurl.so 7f27093a9000-7f27095a9000 ---p 0000b000 08:21 28444170 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/RCurl/libs/RCurl.so 7f27095a9000-7f27095aa000 r--p 0000b000 08:21 28444170 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/RCurl/libs/RCurl.so 7f27095aa000-7f27095ae000 rw-p 0000c000 08:21 28444170 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/RCurl/libs/RCurl.so 7f27095ae000-7f27095b0000 r-xp 00000000 08:01 3075214 /usr/lo cal/lib/R/site-library/bitops/libs/bitops.so 7f27095b0000-7f27097b0000 ---p 00002000 08:01 3075214 /usr/lo cal/lib/R/site-library/bitops/libs/bitops.so 7f27097b0000-7f27097b1000 r--p 00002000 08:01 3075214 /usr/lo cal/lib/R/site-library/bitops/libs/bitops.so 7f27097b1000-7f27097b2000 rw-p 00003000 08:01 3075214 /usr/lo cal/lib/R/site-library/bitops/libs/bitops.so 7f27097b2000-7f2709823000 r-xp 00000000 08:21 23593750 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/ruimtehol/libs/ruimtehol.so 7f2709823000-7f2709a23000 ---p 00071000 08:21 23593750 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/ruimtehol/libs/ruimtehol.so 7f2709a23000-7f2709a25000 r--p 00071000 08:21 23593750 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/ruimtehol/libs/ruimtehol.so 7f2709a25000-7f2709a26000 rw-p 00073000 08:21 23593750 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/ruimtehol/libs/ruimtehol.so 7f2709a26000-7f2709a28000 rw-p 00000000 00:00 0 7f2709a28000-7f2709a63000 r-xp 00000000 08:21 
27919916 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/crfsuite/libs/crfsuite.so 7f2709a63000-7f2709c63000 ---p 0003b000 08:21 27919916 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/crfsuite/libs/crfsuite.so 7f2709c63000-7f2709c64000 r--p 0003b000 08:21 27919916 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/crfsuite/libs/crfsuite.so 7f2709c64000-7f2709c65000 rw-p 0003c000 08:21 27919916 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/crfsuite/libs/crfsuite.so 7f2709c65000-7f2709c79000 r-xp 00000000 08:01 1280692 /usr/li b/R/library/tools/libs/tools.so 7f2709c79000-7f2709e79000 ---p 00014000 08:01 1280692 /usr/li b/R/library/tools/libs/tools.so 7f2709e79000-7f2709e7a000 r--p 00014000 08:01 1280692 /usr/li b/R/library/tools/libs/tools.so 7f2709e7a000-7f2709e7b000 rw-p 00015000 08:01 1280692 /usr/li b/R/library/tools/libs/tools.so 7f2709e7b000-7f2709eab000 r-xp 00000000 08:21 23593820 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so 7f2709eab000-7f270a0ab000 ---p 00030000 08:21 23593820 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so 7f270a0ab000-7f270a0ac000 r--p 00030000 08:21 23593820 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so 7f270a0ac000-7f270a0ad000 rw-p 00031000 08:21 23593820 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/doc2vec/libs/doc2vec.so 7f270a0ad000-7f270a0ae000 rw-p 00000000 00:00 0 7f270a0ae000-7f270a2c3000 r-xp 00000000 08:21 23593024 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/sentencepiece/libs/sentencepiece.so 7f270a2c3000-7f270a4c3000 ---p 00215000 08:21 23593024 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/sentencepiece/libs/sentencepiece.so 7f270a4c3000-7f270a4c6000 r--p 00215000 08:21 23593024 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/sentencepiece/libs/sentencepiece.so 7f270a4c6000-7f270a4c7000 rw-p 00218000 08:21 23593024 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/sentencepiece/libs/sentencepiece.so 
7f270a4c7000-7f270a4cc000 rw-p 00000000 00:00 0 7f270a4cc000-7f270a4da000 r-xp 00000000 08:01 3072517 /usr/li b/R/site-library/jsonlite/libs/jsonlite.so 7f270a4da000-7f270a6da000 ---p 0000e000 08:01 3072517 /usr/li b/R/site-library/jsonlite/libs/jsonlite.so 7f270a6da000-7f270a6db000 r--p 0000e000 08:01 3072517 /usr/li b/R/site-library/jsonlite/libs/jsonlite.so 7f270a6db000-7f270a6dc000 rw-p 0000f000 08:01 3072517 /usr/li b/R/site-library/jsonlite/libs/jsonlite.so 7f270a6dc000-7f270a6f6000 r-xp 00000000 08:01 3073723 /usr/li b/R/site-library/digest/libs/digest.so 7f270a6f6000-7f270a8f5000 ---p 0001a000 08:01 3073723 /usr/li b/R/site-library/digest/libs/digest.so 7f270a8f5000-7f270a8f6000 r--p 00019000 08:01 3073723 /usr/li b/R/site-library/digest/libs/digest.so 7f270a8f6000-7f270a8f7000 rw-p 0001a000 08:01 3073723 /usr/li b/R/site-library/digest/libs/digest.so 7f270a8f7000-7f270a8f9000 rw-p 00000000 00:00 0 7f270a8f9000-7f270a900000 r-xp 00000000 08:01 25741 /usr/li b/x86_64-linux-gnu/libffi.so.6.0.4 7f270a900000-7f270aaff000 ---p 00007000 08:01 25741 /usr/li b/x86_64-linux-gnu/libffi.so.6.0.4 7f270aaff000-7f270ab00000 r--p 00006000 08:01 25741 /usr/li b/x86_64-linux-gnu/libffi.so.6.0.4 7f270ab00000-7f270ab01000 rw-p 00007000 08:01 25741 /usr/li b/x86_64-linux-gnu/libffi.so.6.0.4 7f270ab01000-7f270ab0a000 r-xp 00000000 08:01 55471 /lib/x8 6_64-linux-gnu/libcrypt-2.23.so 7f270ab0a000-7f270ad09000 ---p 00009000 08:01 55471 /lib/x8 6_64-linux-gnu/libcrypt-2.23.so 7f270ad09000-7f270ad0a000 r--p 00008000 08:01 55471 /lib/x8 6_64-linux-gnu/libcrypt-2.23.so 7f270ad0a000-7f270ad0b000 rw-p 00009000 08:01 55471 /lib/x8 6_64-linux-gnu/libcrypt-2.23.so 7f270ad0b000-7f270ad39000 rw-p 00000000 00:00 0 7f270ad39000-7f270ae09000 r-xp 00000000 08:01 71155 /usr/li b/x86_64-linux-gnu/libsqlite3.so.0.8.6 7f270ae09000-7f270b008000 ---p 000d0000 08:01 71155 /usr/li b/x86_64-linux-gnu/libsqlite3.so.0.8.6 7f270b008000-7f270b00b000 r--p 000cf000 08:01 71155 /usr/li 
b/x86_64-linux-gnu/libsqlite3.so.0.8.6 7f270b00b000-7f270b00d000 rw-p 000d2000 08:01 71155 /usr/li b/x86_64-linux-gnu/libsqlite3.so.0.8.6 7f270b00d000-7f270b00e000 rw-p 00000000 00:00 0 7f270b00e000-7f270b055000 r-xp 00000000 08:01 25854 /usr/li b/x86_64-linux-gnu/libhx509.so.5.0.0 7f270b055000-7f270b254000 ---p 00047000 08:01 25854 /usr/li b/x86_64-linux-gnu/libhx509.so.5.0.0 7f270b254000-7f270b256000 r--p 00046000 08:01 25854 /usr/li b/x86_64-linux-gnu/libhx509.so.5.0.0 7f270b256000-7f270b258000 rw-p 00048000 08:01 25854 /usr/li b/x86_64-linux-gnu/libhx509.so.5.0.0 7f270b258000-7f270b259000 rw-p 00000000 00:00 0 7f270b259000-7f270b267000 r-xp 00000000 08:01 25849 /usr/li b/x86_64-linux-gnu/libheimbase.so.1.0.0 7f270b267000-7f270b466000 ---p 0000e000 08:01 25849 /usr/li b/x86_64-linux-gnu/libheimbase.so.1.0.0 7f270b466000-7f270b467000 r--p 0000d000 08:01 25849 /usr/li b/x86_64-linux-gnu/libheimbase.so.1.0.0 7f270b467000-7f270b468000 rw-p 0000e000 08:01 25849 /usr/li b/x86_64-linux-gnu/libheimbase.so.1.0.0 7f270b468000-7f270b48f000 r-xp 00000000 08:01 25852 /usr/li b/x86_64-linux-gnu/libwind.so.0.0.0 7f270b48f000-7f270b68f000 ---p 00027000 08:01 25852 /usr/li b/x86_64-linux-gnu/libwind.so.0.0.0 7f270b68f000-7f270b690000 r--p 00027000 08:01 25852 /usr/li b/x86_64-linux-gnu/libwind.so.0.0.0 7f270b690000-7f270b691000 rw-p 00028000 08:01 25852 /usr/li b/x86_64-linux-gnu/libwind.so.0.0.0 7f270b691000-7f270b710000 r-xp 00000000 08:01 25745 /usr/li b/x86_64-linux-gnu/libgmp.so.10.3.0 7f270b710000-7f270b90f000 ---p 0007f000 08:01 25745 /usr/li b/x86_64-linux-gnu/libgmp.so.10.3.0 7f270b90f000-7f270b910000 r--p 0007e000 08:01 25745 /usr/li b/x86_64-linux-gnu/libgmp.so.10.3.0 7f270b910000-7f270b911000 rw-p 0007f000 08:01 25745 /usr/li b/x86_64-linux-gnu/libgmp.so.10.3.0 7f270b911000-7f270b943000 r-xp 00000000 08:01 25693 /usr/li b/x86_64-linux-gnu/libhogweed.so.4.2 7f270b943000-7f270bb42000 ---p 00032000 08:01 25693 /usr/li b/x86_64-linux-gnu/libhogweed.so.4.2 
7f270bb42000-7f270bb43000 r--p 00031000 08:01 25693 /usr/li b/x86_64-linux-gnu/libhogweed.so.4.2 7f270bb43000-7f270bb44000 rw-p 00032000 08:01 25693 /usr/li b/x86_64-linux-gnu/libhogweed.so.4.2 7f270bb44000-7f270bb78000 r-xp 00000000 08:01 25695 /usr/li b/x86_64-linux-gnu/libnettle.so.6.2 7f270bb78000-7f270bd77000 ---p 00034000 08:01 25695 /usr/li b/x86_64-linux-gnu/libnettle.so.6.2 7f270bd77000-7f270bd79000 r--p 00033000 08:01 25695 /usr/li b/x86_64-linux-gnu/libnettle.so.6.2 7f270bd79000-7f270bd7a000 rw-p 00035000 08:01 25695 /usr/li b/x86_64-linux-gnu/libnettle.so.6.2 7f270bd7a000-7f270bd8b000 r-xp 00000000 08:01 11367 /usr/li b/x86_64-linux-gnu/libtasn1.so.6.5.1 7f270bd8b000-7f270bf8b000 ---p 00011000 08:01 11367 /usr/li b/x86_64-linux-gnu/libtasn1.so.6.5.1 7f270bf8b000-7f270bf8c000 r--p 00011000 08:01 11367 /usr/li b/x86_64-linux-gnu/libtasn1.so.6.5.1 7f270bf8c000-7f270bf8d000 rw-p 00012000 08:01 11367 /usr/li b/x86_64-linux-gnu/libtasn1.so.6.5.1 7f270bf8d000-7f270bfbe000 r-xp 00000000 08:01 25697 /usr/li b/x86_64-linux-gnu/libidn.so.11.6.15 7f270bfbe000-7f270c1be000 ---p 00031000 08:01 25697 /usr/li b/x86_64-linux-gnu/libidn.so.11.6.15 7f270c1be000-7f270c1bf000 r--p 00031000 08:01 25697 /usr/li b/x86_64-linux-gnu/libidn.so.11.6.15 7f270c1bf000-7f270c1c0000 rw-p 00032000 08:01 25697 /usr/li b/x86_64-linux-gnu/libidn.so.11.6.15 7f270c1c0000-7f270c219000 r-xp 00000000 08:01 25699 /usr/li b/x86_64-linux-gnu/libp11-kit.so.0.1.0 7f270c219000-7f270c418000 ---p 00059000 08:01 25699 /usr/li b/x86_64-linux-gnu/libp11-kit.so.0.1.0 7f270c418000-7f270c422000 r--p 00058000 08:01 25699 /usr/li b/x86_64-linux-gnu/libp11-kit.so.0.1.0 7f270c422000-7f270c424000 rw-p 00062000 08:01 25699 /usr/li b/x86_64-linux-gnu/libp11-kit.so.0.1.0 7f270c424000-7f270c439000 r-xp 00000000 08:01 25841 /usr/li b/x86_64-linux-gnu/libroken.so.18.1.0 7f270c439000-7f270c638000 ---p 00015000 08:01 25841 /usr/li b/x86_64-linux-gnu/libroken.so.18.1.0 7f270c638000-7f270c639000 r--p 00014000 08:01 25841 
/usr/li b/x86_64-linux-gnu/libroken.so.18.1.0 7f270c639000-7f270c63a000 rw-p 00015000 08:01 25841 /usr/li b/x86_64-linux-gnu/libroken.so.18.1.0 7f270c63a000-7f270c66a000 r-xp 00000000 08:01 25846 /usr/li b/x86_64-linux-gnu/libhcrypto.so.4.1.0 7f270c66a000-7f270c86a000 ---p 00030000 08:01 25846 /usr/li b/x86_64-linux-gnu/libhcrypto.so.4.1.0 7f270c86a000-7f270c86b000 r--p 00030000 08:01 25846 /usr/li b/x86_64-linux-gnu/libhcrypto.so.4.1.0 7f270c86b000-7f270c86c000 rw-p 00031000 08:01 25846 /usr/li b/x86_64-linux-gnu/libhcrypto.so.4.1.0 7f270c86c000-7f270c86d000 rw-p 00000000 00:00 0 7f270c86d000-7f270c90c000 r-xp 00000000 08:01 25843 /usr/li b/x86_64-linux-gnu/libasn1.so.8.0.0 7f270c90c000-7f270cb0b000 ---p 0009f000 08:01 25843 /usr/li b/x86_64-linux-gnu/libasn1.so.8.0.0 7f270cb0b000-7f270cb0c000 r--p 0009e000 08:01 25843 /usr/li b/x86_64-linux-gnu/libasn1.so.8.0.0 7f270cb0c000-7f270cb0f000 rw-p 0009f000 08:01 25843 /usr/li b/x86_64-linux-gnu/libasn1.so.8.0.0 7f270cb0f000-7f270cb93000 r-xp 00000000 08:01 25857 /usr/li b/x86_64-linux-gnu/libkrb5.so.26.0.0 7f270cb93000-7f270cd92000 ---p 00084000 08:01 25857 /usr/li b/x86_64-linux-gnu/libkrb5.so.26.0.0 7f270cd92000-7f270cd95000 r--p 00083000 08:01 25857 /usr/li b/x86_64-linux-gnu/libkrb5.so.26.0.0 7f270cd95000-7f270cd98000 rw-p 00086000 08:01 25857 /usr/li b/x86_64-linux-gnu/libkrb5.so.26.0.0 7f270cd98000-7f270cd99000 rw-p 00000000 00:00 0 7f270cd99000-7f270cda1000 r-xp 00000000 08:01 25859 /usr/li b/x86_64-linux-gnu/libheimntlm.so.0.1.0 7f270cda1000-7f270cfa0000 ---p 00008000 08:01 25859 /usr/li b/x86_64-linux-gnu/libheimntlm.so.0.1.0 7f270cfa0000-7f270cfa1000 r--p 00007000 08:01 25859 /usr/li b/x86_64-linux-gnu/libheimntlm.so.0.1.0 7f270cfa1000-7f270cfa2000 rw-p 00008000 08:01 25859 /usr/li b/x86_64-linux-gnu/libheimntlm.so.0.1.0 7f270cfa2000-7f270cfa5000 r-xp 00000000 08:01 2172 /lib/x8 6_64-linux-gnu/libkeyutils.so.1.5 7f270cfa5000-7f270d1a4000 ---p 00003000 08:01 2172 /lib/x8 6_64-linux-gnu/libkeyutils.so.1.5 
7f270d1a4000-7f270d1a5000 r--p 00002000 08:01 2172 /lib/x8 6_64-linux-gnu/libkeyutils.so.1.5 7f270d1a5000-7f270d1a6000 rw-p 00003000 08:01 2172 /lib/x8 6_64-linux-gnu/libkeyutils.so.1.5 7f270d1a6000-7f270d2c9000 r-xp 00000000 08:01 17723 /usr/li b/x86_64-linux-gnu/libgnutls.so.30.6.2 7f270d2c9000-7f270d4c8000 ---p 00123000 08:01 17723 /usr/li b/x86_64-linux-gnu/libgnutls.so.30.6.2 7f270d4c8000-7f270d4d3000 r--p 00122000 08:01 17723 /usr/li b/x86_64-linux-gnu/libgnutls.so.30.6.2 7f270d4d3000-7f270d4d5000 rw-p 0012d000 08:01 17723 /usr/li b/x86_64-linux-gnu/libgnutls.so.30.6.2 7f270d4d5000-7f270d4d6000 rw-p 00000000 00:00 0 7f270d4d6000-7f270d513000 r-xp 00000000 08:01 25861 /usr/li b/x86_64-linux-gnu/libgssapi.so.3.0.0 7f270d513000-7f270d713000 ---p 0003d000 08:01 25861 /usr/li b/x86_64-linux-gnu/libgssapi.so.3.0.0 7f270d713000-7f270d714000 r--p 0003d000 08:01 25861 /usr/li b/x86_64-linux-gnu/libgssapi.so.3.0.0 7f270d714000-7f270d716000 rw-p 0003e000 08:01 25861 /usr/li b/x86_64-linux-gnu/libgssapi.so.3.0.0 7f270d716000-7f270d717000 rw-p 00000000 00:00 0 7f270d717000-7f270d730000 r-xp 00000000 08:01 25328 /usr/li b/x86_64-linux-gnu/libsasl2.so.2.0.25 7f270d730000-7f270d930000 ---p 00019000 08:01 25328 /usr/li b/x86_64-linux-gnu/libsasl2.so.2.0.25 7f270d930000-7f270d931000 r--p 00019000 08:01 25328 /usr/li b/x86_64-linux-gnu/libsasl2.so.2.0.25 7f270d931000-7f270d932000 rw-p 0001a000 08:01 25328 /usr/li b/x86_64-linux-gnu/libsasl2.so.2.0.25 7f270d932000-7f270d949000 r-xp 00000000 08:01 60588 /lib/x8 6_64-linux-gnu/libresolv-2.23.so 7f270d949000-7f270db49000 ---p 00017000 08:01 60588 /lib/x8 6_64-linux-gnu/libresolv-2.23.so 7f270db49000-7f270db4a000 r--p 00017000 08:01 60588 /lib/x8 6_64-linux-gnu/libresolv-2.23.so 7f270db4a000-7f270db4b000 rw-p 00018000 08:01 60588 /lib/x8 6_64-linux-gnu/libresolv-2.23.so 7f270db4b000-7f270db4d000 rw-p 00000000 00:00 0 7f270db4d000-7f270db5a000 r-xp 00000000 08:01 13045 /usr/li b/x86_64-linux-gnu/liblber-2.4.so.2.10.5 
7f270db5a000-7f270dd5a000 ---p 0000d000 08:01 13045 /usr/li b/x86_64-linux-gnu/liblber-2.4.so.2.10.5 7f270dd5a000-7f270dd5b000 r--p 0000d000 08:01 13045 /usr/li b/x86_64-linux-gnu/liblber-2.4.so.2.10.5 7f270dd5b000-7f270dd5c000 rw-p 0000e000 08:01 13045 /usr/li b/x86_64-linux-gnu/liblber-2.4.so.2.10.5 7f270dd5c000-7f270dd66000 r-xp 00000000 08:01 41076 /usr/li b/x86_64-linux-gnu/libkrb5support.so.0.1 7f270dd66000-7f270df65000 ---p 0000a000 08:01 41076 /usr/li b/x86_64-linux-gnu/libkrb5support.so.0.1 7f270df65000-7f270df66000 r--p 00009000 08:01 41076 /usr/li b/x86_64-linux-gnu/libkrb5support.so.0.1 7f270df66000-7f270df67000 rw-p 0000a000 08:01 41076 /usr/li b/x86_64-linux-gnu/libkrb5support.so.0.1 7f270df67000-7f270df6a000 r-xp 00000000 08:01 18086 /lib/x8 6_64-linux-gnu/libcom_err.so.2.1 7f270df6a000-7f270e169000 ---p 00003000 08:01 18086 /lib/x8 6_64-linux-gnu/libcom_err.so.2.1 7f270e169000-7f270e16a000 r--p 00002000 08:01 18086 /lib/x8 6_64-linux-gnu/libcom_err.so.2.1 7f270e16a000-7f270e16b000 rw-p 00003000 08:01 18086 /lib/x8 6_64-linux-gnu/libcom_err.so.2.1 7f270e16b000-7f270e197000 r-xp 00000000 08:01 39874 /usr/li b/x86_64-linux-gnu/libk5crypto.so.3.1 7f270e197000-7f270e396000 ---p 0002c000 08:01 39874 /usr/li b/x86_64-linux-gnu/libk5crypto.so.3.1 7f270e396000-7f270e398000 r--p 0002b000 08:01 39874 /usr/li b/x86_64-linux-gnu/libk5crypto.so.3.1 7f270e398000-7f270e399000 rw-p 0002d000 08:01 39874 /usr/li b/x86_64-linux-gnu/libk5crypto.so.3.1 7f270e399000-7f270e39a000 rw-p 00000000 00:00 0 7f270e39a000-7f270e45d000 r-xp 00000000 08:01 40883 /usr/li b/x86_64-linux-gnu/libkrb5.so.3.3 7f270e45d000-7f270e65d000 ---p 000c3000 08:01 40883 /usr/li b/x86_64-linux-gnu/libkrb5.so.3.3 7f270e65d000-7f270e66a000 r--p 000c3000 08:01 40883 /usr/li b/x86_64-linux-gnu/libkrb5.so.3.3 7f270e66a000-7f270e66c000 rw-p 000d0000 08:01 40883 /usr/li b/x86_64-linux-gnu/libkrb5.so.3.3 7f270e66c000-7f270e6b9000 r-xp 00000000 08:01 13041 /usr/li b/x86_64-linux-gnu/libldap_r-2.4.so.2.10.5 
7f270e6b9000-7f270e8b8000 ---p 0004d000 08:01 13041 /usr/li b/x86_64-linux-gnu/libldap_r-2.4.so.2.10.5 7f270e8b8000-7f270e8ba000 r--p 0004c000 08:01 13041 /usr/li b/x86_64-linux-gnu/libldap_r-2.4.so.2.10.5 7f270e8ba000-7f270e8bb000 rw-p 0004e000 08:01 13041 /usr/li b/x86_64-linux-gnu/libldap_r-2.4.so.2.10.5 7f270e8bb000-7f270e8bd000 rw-p 00000000 00:00 0 7f270e8bd000-7f270e904000 r-xp 00000000 08:01 39986 /usr/li b/x86_64-linux-gnu/libgssapi_krb5.so.2.2 7f270e904000-7f270eb03000 ---p 00047000 08:01 39986 /usr/li b/x86_64-linux-gnu/libgssapi_krb5.so.2.2 7f270eb03000-7f270eb05000 r--p 00046000 08:01 39986 /usr/li b/x86_64-linux-gnu/libgssapi_krb5.so.2.2 7f270eb05000-7f270eb07000 rw-p 00048000 08:01 39986 /usr/li b/x86_64-linux-gnu/libgssapi_krb5.so.2.2 7f270eb07000-7f270ed22000 r-xp 00000000 08:01 43204 /lib/x8 6_64-linux-gnu/libcrypto.so.1.0.0 7f270ed22000-7f270ef21000 ---p 0021b000 08:01 43204 /lib/x8 6_64-linux-gnu/libcrypto.so.1.0.0 7f270ef21000-7f270ef3d000 r--p 0021a000 08:01 43204 /lib/x8 6_64-linux-gnu/libcrypto.so.1.0.0 7f270ef3d000-7f270ef49000 rw-p 00236000 08:01 43204 /lib/x8 6_64-linux-gnu/libcrypto.so.1.0.0 7f270ef49000-7f270ef4c000 rw-p 00000000 00:00 0 7f270ef4c000-7f270efaa000 r-xp 00000000 08:01 43203 /lib/x8 6_64-linux-gnu/libssl.so.1.0.0 7f270efaa000-7f270f1aa000 ---p 0005e000 08:01 43203 /lib/x8 6_64-linux-gnu/libssl.so.1.0.0 7f270f1aa000-7f270f1ae000 r--p 0005e000 08:01 43203 /lib/x8 6_64-linux-gnu/libssl.so.1.0.0 7f270f1ae000-7f270f1b4000 rw-p 00062000 08:01 43203 /lib/x8 6_64-linux-gnu/libssl.so.1.0.0 7f270f1b4000-7f270f1f9000 r-xp 00000000 08:01 24901 /usr/li b/x86_64-linux-gnu/libpq.so.5.10 7f270f1f9000-7f270f3f8000 ---p 00045000 08:01 24901 /usr/li b/x86_64-linux-gnu/libpq.so.5.10 7f270f3f8000-7f270f3fb000 r--p 00044000 08:01 24901 /usr/li b/x86_64-linux-gnu/libpq.so.5.10 7f270f3fb000-7f270f3fc000 rw-p 00047000 08:01 24901 /usr/li b/x86_64-linux-gnu/libpq.so.5.10 7f270f3fc000-7f270f409000 r-xp 00000000 08:01 3072697 /usr/li 
b/R/site-library/RPostgreSQL/libs/RPostgreSQL.so 7f270f409000-7f270f608000 ---p 0000d000 08:01 3072697 /usr/li b/R/site-library/RPostgreSQL/libs/RPostgreSQL.so 7f270f608000-7f270f609000 r--p 0000c000 08:01 3072697 /usr/li b/R/site-library/RPostgreSQL/libs/RPostgreSQL.so 7f270f609000-7f270f60a000 rw-p 0000d000 08:01 3072697 /usr/li b/R/site-library/RPostgreSQL/libs/RPostgreSQL.so 7f270f60a000-7f270f7b0000 r-xp 00000000 08:21 23593734 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/udpipe/libs/udpipe.so 7f270f7b0000-7f270f9b0000 ---p 001a6000 08:21 23593734 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/udpipe/libs/udpipe.so 7f270f9b0000-7f270f9b3000 r--p 001a6000 08:21 23593734 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/udpipe/libs/udpipe.so 7f270f9b3000-7f270f9b4000 rw-p 001a9000 08:21 23593734 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/udpipe/libs/udpipe.so 7f270f9b4000-7f270f9b6000 rw-p 00000000 00:00 0 7f270f9b6000-7f270fa09000 r-xp 00000000 08:01 1312727 /usr/li b/R/site-library/data.table/libs/datatable.so 7f270fa09000-7f270fc09000 ---p 00053000 08:01 1312727 /usr/li b/R/site-library/data.table/libs/datatable.so 7f270fc09000-7f270fc0a000 r--p 00053000 08:01 1312727 /usr/li b/R/site-library/data.table/libs/datatable.so 7f270fc0a000-7f270fc0b000 rw-p 00054000 08:01 1312727 /usr/li b/R/site-library/data.table/libs/datatable.so 7f270fc0b000-7f270fc6f000 rw-p 00000000 00:00 0 7f270fc6f000-7f270fde1000 r-xp 00000000 08:01 23451 /usr/li b/x86_64-linux-gnu/libstdc++.so.6.0.21 7f270fde1000-7f270ffe1000 ---p 00172000 08:01 23451 /usr/li b/x86_64-linux-gnu/libstdc++.so.6.0.21 7f270ffe1000-7f270ffeb000 r--p 00172000 08:01 23451 /usr/li b/x86_64-linux-gnu/libstdc++.so.6.0.21 7f270ffeb000-7f270ffed000 rw-p 0017c000 08:01 23451 /usr/li b/x86_64-linux-gnu/libstdc++.so.6.0.21 7f270ffed000-7f270fff1000 rw-p 00000000 00:00 0 7f270fff1000-7f271004e000 r-xp 00000000 08:21 26740107 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/libs/Rcpp.so 
7f271004e000-7f271024e000 ---p 0005d000 08:21 26740107 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/libs/Rcpp.so 7f271024e000-7f271024f000 r--p 0005d000 08:21 26740107 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/libs/Rcpp.so 7f271024f000-7f2710250000 rw-p 0005e000 08:21 26740107 /data/h ome/nba/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/libs/Rcpp.so 7f2710250000-7f2710257000 rw-p 00000000 00:00 0 7f2710257000-7f2710313000 r-xp 00000000 08:01 1286652 /usr/li b/R/library/Matrix/libs/Matrix.so 7f2710313000-7f2710513000 ---p 000bc000 08:01 1286652 /usr/li

jwijffels commented 3 years ago

Use https://github.com/nothings/stb/blob/master/stb_ds.h

jwijffels commented 3 years ago

Solutions:

After the realloc in addToVocab, set the pointers of the newly allocated entries to NULL:

for(long long a = m_vocab_size+1; a < m_vocab_capacity; a++){
  m_vocab[a].word = NULL;
  m_vocab[a].point = NULL;
  m_vocab[a].code = NULL;
}

and delete the corpus correctly in the TrainModelThread destructor:

TrainModelThread::~TrainModelThread()
{
  if(m_neu1) free(m_neu1);
  if(m_neu1e) free(m_neu1e);
  delete m_corpus;
}

or better, since free(NULL) is a no-op and the checks are unnecessary:

TrainModelThread::~TrainModelThread()
{
  free(m_neu1);
  free(m_neu1e);
  delete m_corpus;
}
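
The two fixes above rely on the same idea: realloc leaves the added memory uninitialised, and cleanup code can only free pointers that are either valid or NULL. A minimal standalone sketch of that pattern follows. The names VocabWord, Vocab, vocab_reserve, vocab_add and vocab_free are hypothetical and are not the package's actual classes, and the loop here starts at the old capacity rather than at m_vocab_size+1, which is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a vocabulary entry; only the three pointer
// fields mentioned in the fix are modelled.
struct VocabWord {
  char *word;
  int *point;
  char *code;
};

// Hypothetical growable array (not the package's Vocabulary class).
struct Vocab {
  VocabWord *items = nullptr;
  long long size = 0;      // slots in use
  long long capacity = 0;  // slots allocated
};

// Grow the array and set the pointers of every newly allocated slot to
// NULL, because realloc does not initialise the memory it adds.
bool vocab_reserve(Vocab &v, long long new_capacity) {
  if (new_capacity <= v.capacity) return true;
  void *p = std::realloc(v.items,
                         static_cast<std::size_t>(new_capacity) * sizeof(VocabWord));
  if (p == nullptr) return false;  // the old block stays valid on failure
  v.items = static_cast<VocabWord *>(p);
  for (long long a = v.capacity; a < new_capacity; a++) {
    v.items[a].word = nullptr;
    v.items[a].point = nullptr;
    v.items[a].code = nullptr;
  }
  v.capacity = new_capacity;
  return true;
}

// Append a copy of `word`, growing in steps of 1000 slots when full.
bool vocab_add(Vocab &v, const char *word) {
  if (v.size == v.capacity && !vocab_reserve(v, v.capacity + 1000)) return false;
  std::size_t n = std::strlen(word) + 1;
  char *copy = static_cast<char *>(std::malloc(n));
  if (copy == nullptr) return false;
  std::memcpy(copy, word, n);
  v.items[v.size].word = copy;
  v.size++;
  return true;
}

// Free every slot up to capacity. Unused slots hold NULL, and free(NULL)
// does nothing, so no per-pointer checks are needed.
void vocab_free(Vocab &v) {
  for (long long a = 0; a < v.capacity; a++) {
    std::free(v.items[a].word);
    std::free(v.items[a].point);
    std::free(v.items[a].code);
  }
  std::free(v.items);
  v = Vocab();
}
```

Without the NULL initialisation in vocab_reserve, vocab_free would pass uninitialised pointers to free for the unused slots, which is the kind of error valgrind and the address sanitizers report.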

jwijffels commented 3 years ago

Solved in release 0.1.1: https://github.com/bnosac/doc2vec/releases/tag/0.1.1