Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0
3.57k stars 974 forks source link

Test suite hits 2GB limit on 32bit R-devel #2767

Closed mattdowle closed 6 years ago

mattdowle commented 6 years ago

I've just seen this occur, here : https://ci.appveyor.com/project/Rdatatable/data-table/build/1.0.2337/job/3rppwcw31ent8g1r

Jan saw it too today here : https://github.com/Rdatatable/data.table/pull/2761#issuecomment-382277970

So, despite all our efforts, there's still more to do.

The error I saw is similar to Jan's :

  Running test id 1667.1      Error: memory exhausted (limit reached?)
  Error: memory exhausted (limit reached?)
  Error: memory exhausted (limit reached?)
  Execution halted

That message is R's, here : https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L2059 and looking at the context there (inside an err_malloc function) suggests AppVeyor instance is declining R's malloc request. (i.e. it doesn't seem like some kind of stack limit inside R itself).

@jangorecki

HughParsonage commented 6 years ago

Isn't i386 limited to 2 GB?

mattdowle commented 6 years ago

@HughParsonage Oh ... yes! Just checked and yes both jobs failed just on 32bit. Thanks!

So it's not AppVeyor then. It's just a 32bit / 2GB issue. That leaves us with why there's a gap between the 160MB in memtest chart and 2GB. If the memtest scale is 10x out (does the output of ps change scale perhaps and needs to be fixed scale by passing a parameter?) then that would make more sense: 1600MB for R would then make sense to try to reduce by focussing on the 3 tests towards the start of the test suite.

(I guess internal R object headers have increased a little bit in R-devel, causing, on 32-bit, R-release to be just under the 2GB limit and R-devel to be just over. Explaining why it's just R-devel.)

jangorecki commented 6 years ago

Failed build that happened to me was on

Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 R2 x64 (build 9600)

* using platform: x86_64-w64-mingw32 (64-bit)

And there were no suggested packages installed

Packages suggested but not available for checking:
  'bit64' 'curl' 'knitr' 'xts' 'nanotime' 'zoo' 'chron' 'reshape2'
  'testthat' 'GenomicRanges'

Probably we shouldn't use ps -o rss. There are many topics discussing that ps don't give proper memory measurement, also that RSS overestimate (surprisingly). I checked multiple sizes of data created in R and ps does not change unit, always returns KB. Will try to add another tool to memtest then.

mattdowle commented 6 years ago

Just on that error, although yes the platform is 64bit, the multiarch R CMD check runs tests for both 32bit and 64bit. When I checked your fail, I downloaded the failure.zip artifact and looked inside both i386 and x64 directories. Just the i386 directory contained the main.fail file while x64 directory contained main.out passing ok. Yes agree 'ps' next then. I can't think why the pattern of RAM usage makes sense but not the scale. The figures not being perfect makes sense, but not 10x off.

Oh ... how about just free -m. That's the one that matters and is reliable as far as I know. Deltas will work on that figure. In CI no other processes will be running, but if they are somehow, then we want to know anyway.

jangorecki commented 6 years ago

I checked locally memory usage with supposedly good tool /usr/bin/time --verbose (which is different app from bash time). Results shows that R-devel is actually eating much less memory than R 3.4.4 (at least on Linux). Still it is far from 2GB limit, anyway it doesn't seems to be related to test suite. More something about R-devel and memory management on 32bit Windows - maybe recent memory optimization does not play well on 32bit Win. Next step to debug this issue would be to measure memory on Windows. Building vignettes raises RSS by ~800MB on R 3.4.4 but they are disabled in appveyor script.

R 3.4.4

R CMD check took 364MB RSS.

    Command being timed: "R CMD check data.table_1.10.5.tar.gz --ignore-vignettes --no-manual"
    Maximum resident set size (kbytes): 364648

test.data.table took 351MB RSS.

    Command being timed: "Rscript -e require(data.table); test.data.table()"
    Maximum resident set size (kbytes): 351236

R-devel (2018-04-06)

R CMD check took 224MB RSS.

    Command being timed: "/home/jan/R-devel/bin/R CMD check data.table_1.10.5.tar.gz --ignore-vignettes --no-manual"
    Maximum resident set size (kbytes): 224640

test.data.table took 207MB RSS.

    Command being timed: "/home/jan/R-devel/bin/Rscript -e require(data.table); test.data.table()"
    Maximum resident set size (kbytes): 207032
mattdowle commented 6 years ago

Just happened on win-builder ... https://win-builder.r-project.org/0UGUaRk3ifo2/

** running tests for arch 'i386' ... [81s] ERROR
  Running 'autoprint.R' [1s]
  Comparing 'autoprint.Rout' to 'autoprint.Rout.save' ... OK
  Running 'knitr.R' [1s]
  Comparing 'knitr.Rout' to 'knitr.Rout.save' ... OK
  Running 'main.R' [76s]
  Running 'testthat.R' [2s]
Running the tests in 'tests/main.R' failed.
Complete output:
  > require(data.table)
  Loading required package: data.table
  > test.data.table()  # runs the main test suite of 5,000+ tests in /inst/tests/tests.Rraw
  Running D:/temp/Rtmpqkla50/RLIBS_169b8128a64aa/data.table/tests/tests.Rraw 

  Running test id 1.1      
  Running test id 1.2      
...
  Running test id 1648.086      
  Running test id 1648.087      
  Running test id 1648.088      
  Running test id 1648.089      
  Running test id 1648.09      
  Running test id 1648.091      Error: cannot allocate vector of size 4 Kb
  Execution halted
** running tests for arch 'x64' ... [135s] OK

so, as before, it's ok on 64bit but not 32bit.

MichaelChirico commented 6 years ago

have we done something like -- every 500 tests, clean up all the stale test objects laying around (x, y, DT, DT1, DT2, dt, df, ...)? should be minor but might help

HughParsonage commented 6 years ago

Perhaps change:

nq_fun = function(n=100L) {

to

nq_fun = function(n=if (identical(.Platform$OS.type, "windows") && identical(.Platform$r_arch, "i386")) 10L else 100L) {

or nq_fun = function(n = if (memory.limit() < 3*1024) 10L else 100L) {

mattdowle commented 6 years ago

@MichaelChirico Jan implemented the memory tracing but it doesn't look like it's using more than 150MB of RAM. It's odd. Yes I thought there would be some large stale objects to rm() but it doesn't seem like it from memtest.

@HughParsonage We don't think the nq tests are using the ram, though. Somewhere earlier. But ok might as well turn off on 32bit, no harm ...

jangorecki commented 6 years ago

My guess, not backed by anything, is that R 3.5.0 has some memory bug in 32bit win (AFAIK this issue is related only to this version/platform/arch). There were significant improvements for memory handling in R 3.5.0, on linux tests reports 207MB while R 3.4.4 took 351MB. You can always run tests.Rraw interactively and at the end investigate objects that are left after tests. Anyway debugging makes more sense using 32bit windows.

HughParsonage commented 6 years ago

I just ran test.data.table() on Windows 10 using the 32-bit version of R 3.5.0 and got through OK. (Some Bioconductor packages weren't installed though.)

jangorecki commented 6 years ago

@HughParsonage could you occupy most of memory by some other process? leave 1.5-2GB of memory for tests.

HughParsonage commented 6 years ago

With less than 1 GB available, I was able to run it without problems.

mattdowle commented 6 years ago

@HughParsonage bit64 installed and no messages from test.data.table() that it couldn't find packages? By Bioconductor packages do you mean GenomicRanges? It's possible the memory usage is there, I guess, if that's the only difference!

HughParsonage commented 6 years ago

Yes, sorry just GenomicRanges not installed.

mattdowle commented 6 years ago

Trouble is we're flying blind as neither AppVeyor or win-builder tells use what and when uses the ram. Jan said GenomicRanges pulls in a large bunch of dependencies, and we want to remove it anyway, so that would be reasonable to try next then.

mattdowle commented 6 years ago

Jan's idea is good to inspect the environment at the end of tests.Rraw to see if any large objects are left hanging around. If so, can be rm()'d after their test.

jangorecki commented 6 years ago

Installing GenomicRanges result installing 11 packages, one of them 17MB. Appveyor does not install GenomicRanges so issues are not related to it. Assuming win builder failure is caused by same issue as appveyor 32bit win r-devel builds. From r-devel windows build on appveyor:

Packages suggested but not available for checking:
  'bit64' 'curl' 'knitr' 'xts' 'nanotime' 'zoo' 'chron' 'reshape2'
  'testthat' 'GenomicRanges'

Nice helpers for checking memory usage of objects in workspace: https://stackoverflow.com/questions/1358003/tricks-to-manage-the-available-memory-in-an-r-session

mattdowle commented 6 years ago

The plot thickens. Just got this from win-builder (with latest master as of now because I was testing my commits turning off large tests on 32bit) ... https://win-builder.r-project.org/1W7om9wWK1Un

Running test id 1372.24      
  Running test id 1372.25      
  Running test id 1372.26      
  Running test id 1372.27      
  Running test id 1372.28      Error in initialize(value, ...) : 
    GC encountered a node (0x1089b0e0) with an unknown SEXP type: <unknown> at memory.c:1626
  Calls: test.data.table ... normalizeSingleBracketSubscript -> NSBS -> NSBS -> new2 -> new
  Execution halted

1372.* is GenomicRanges Jan's guess that there's a rare bug in R 3.5 on 32bit looks good.

jangorecki commented 6 years ago

So testing with GenomicRanges make sense... Mentioned error: https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L3043-L3062 We can eventually try moving GenomicRanges tests to other.Rraw and GenomicRanges from DESCRIPTION to inst/tests/tests-DESCRIPTION. While it may help to pass the tests, it won't solve underlying issue. Quite a lot of changes to memory.c in recent months: https://github.com/wch/r-source/commits/trunk/src/main/memory.c

HughParsonage commented 6 years ago

Is this installation error for zlibbioc (a dependency of GenomicRanges) illuminating?

trying URL 'https://bioconductor.org/packages/release/bioc/src/contrib/zlibbioc_1.24.0.tar.gz'
Content type 'application/x-gzip' length 257311 bytes (251 KB)
downloaded 251 KB

* installing *source* package 'zlibbioc' ...

   **********************************************
   WARNING: this package has a configure script
         It probably needs manual configuration
   **********************************************

** libs

*** arch - i386
mkdir -p "C:/R/R-3.5.0/library/zlibbioc/include" "C:/R/R-3.5.0/library/zlibbioc/libs/i386"
make -C zlib-1.2.5 -fwin32/Makefile.gcc libzbioc.a zlib1bioc.dll
make[1]: Entering directory `/cygdrive/c/Users/hughp/AppData/Local/Temp/Rtmp6HboKw/R.INSTALLb0c7c2c5bc/zlibbioc/src-i386/zlib-1.2.5'
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o adler32.o adler32.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o compress.o compress.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o crc32.o crc32.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o deflate.o deflate.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o gzclose.o gzclose.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o gzlib.o gzlib.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o gzread.o gzread.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o gzwrite.o gzwrite.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o infback.o infback.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o inffast.o inffast.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o inflate.o inflate.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o inftrees.o inftrees.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o trees.o trees.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o uncompr.o uncompr.c
gcc -m32  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o zutil.o zutil.c
ar rcs libzbioc.a adler32.o compress.o crc32.o deflate.o gzclose.o gzlib.o gzread.o gzwrite.o infback.o inffast.o inflate.o inftrees.o trees.o uncompr.o zutil.o 
windres --define GCC_WINDRES --target=pe-i386 -o zlibrc.o win32/zlib1.rc
gcc -m32 -shared -Wl,--out-implib,libzbiocdll.a  \
-o zlib1bioc.dll win32/zlib.def adler32.o compress.o crc32.o deflate.o gzclose.o gzlib.o gzread.o gzwrite.o infback.o inffast.o inflate.o inftrees.o trees.o uncompr.o zutil.o  zlibrc.o
strip zlib1bioc.dll
make[1]: Leaving directory `/cygdrive/c/Users/hughp/AppData/Local/Temp/Rtmp6HboKw/R.INSTALLb0c7c2c5bc/zlibbioc/src-i386/zlib-1.2.5'
cp zlib-1.2.5/libzbioc.a zlib-1.2.5/zlib1bioc.dll \
            "C:/R/R-3.5.0/library/zlibbioc/libs/i386"
cp zlib-1.2.5/zconf.h zlib-1.2.5/zlib.h "C:/R/R-3.5.0/library/zlibbioc/include"
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-35~1.0/include" -DNDEBUG          -O3 -Wall  -std=gnu99 -mtune=generic -c R_init_zlibbioc.c -o R_init_zlibbioc.o
c:/Rtools/mingw_32/bin/gcc -shared -s -static-libgcc -o zlibbioc.dll tmp.def R_init_zlibbioc.o -LC:/R/R-35~1.0/bin/i386 -lR
installing to C:/R/R-3.5.0/library/zlibbioc/libs/i386

*** arch - x64
mkdir -p "C:/R/R-3.5.0/library/zlibbioc/include" "C:/R/R-3.5.0/library/zlibbioc/libs/x64"
make -C zlib-1.2.5 -fwin32/Makefile.gcc libzbioc.a zlib1bioc.dll
make[1]: Entering directory `/cygdrive/c/Users/hughp/AppData/Local/Temp/Rtmp6HboKw/R.INSTALLb0c7c2c5bc/zlibbioc/src-x64/zlib-1.2.5'
gcc -m64  -O3 -Wall -DNO_VIZ -DZ_PREFIX -c -o adler32.o adler32.c
adler32.c:1:0: sorry, unimplemented: 64-bit mode not compiled in
 /* adler32.c -- compute the Adler-32 checksum of a data stream
 ^
make[1]: *** [adler32.o] Error 1
make[1]: Leaving directory `/cygdrive/c/Users/hughp/AppData/Local/Temp/Rtmp6HboKw/R.INSTALLb0c7c2c5bc/zlibbioc/src-x64/zlib-1.2.5'
make: *** [libzbioc] Error 2
ERROR: compilation failed for package 'zlibbioc'
* removing 'C:/R/R-3.5.0/library/zlibbioc'
In R CMD INSTALL
Warning in install.packages :
  installation of package ‘C:/Users/hughp/AppData/Local/Temp/Rtmpy6Q6nh/downloaded_packages/zlibbioc_1.24.0.tar.gz’ had non-zero exit status
mattdowle commented 6 years ago

With GenomicRanges gone, we passed win-builder ! https://win-builder.r-project.org/EdA01oGY8uCP https://win-builder.r-project.org/fmpoU9U9p8Xs

jangorecki commented 6 years ago

While this fixes win builder the issue that occurs in appveyor is still valid as we don't even use GenomicRanges there.

mattdowle commented 6 years ago

Failed on win-builder again : https://win-builder.r-project.org/3qtvkGEiFCxt/

mattdowle commented 6 years ago

R-devel compiled with --strict-barrier and ASAN (CRAN_Release.cmd:188) found the following. Relief! So it was the GenomicRanges test, but not GenomicRanges per se, or the data size. Looks like a well-defined fault in ijoin.c. The memory exhausted errors we've been seeing are feasible when there's in fact an earlier memory corruption, not memory usage at all. This fits with the win-builder fails. Not sure about AppVeyor, though, since as Jan said, AppVeyor wasn't running the GenomicRanges test as it didn't have it installed. AppVeyor is running that test now, though. Could be another different fault after this one (will see when this one fixed). Could be that foverlaps() in other tests triggers this line (those tests don't compare to GenomicRanges and were always running.) Anyway ...

Running test id 1372.02      ijoin.c:157:24: runtime error: store to misaligned address 0x000000000001 for type 'int', which requires 4 byte alignment
0x000000000001: note: pointer points here
<memory cannot be printed>
ASAN:DEADLYSIGNAL
=================================================================
==28056==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x7f3e3e0b23b4 bp 0x7fff14f7a810 sp 0x7fff14f7a730 T0)
==28056==The signal is caused by a WRITE memory access.
==28056==Hint: address points to the zero page.
    #0 0x7f3e3e0b23b3 in lookup /tmp/Rtmp8649wu/R.INSTALL6cedcd51d6d/data.table/src/ijoin.c:157
    #1 0x55c02c4bcfbb in R_doDotCall /home/mdowle/build/R-devel/src/main/dotcode.c:596
    #2 0x55c02c5e3cd5 in bcEval /home/mdowle/build/R-devel/src/main/eval.c:7279
    #3 0x55c02c588a6d in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:624
    #4 0x55c02c58db20 in R_execClosure /home/mdowle/build/R-devel/src/main/eval.c:1764
    #5 0x55c02c58d006 in Rf_applyClosure /home/mdowle/build/R-devel/src/main/eval.c:1692
    #6 0x55c02c5898a3 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:747
    #7 0x55c02c59754e in do_set /home/mdowle/build/R-devel/src/main/eval.c:2778
    #8 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #9 0x55c02c595781 in do_begin /home/mdowle/build/R-devel/src/main/eval.c:2395
    #10 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #11 0x55c02c593daf in do_for /home/mdowle/build/R-devel/src/main/eval.c:2281
    #12 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #13 0x55c02c595781 in do_begin /home/mdowle/build/R-devel/src/main/eval.c:2395
    #14 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #15 0x55c02c593daf in do_for /home/mdowle/build/R-devel/src/main/eval.c:2281
    #16 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #17 0x55c02c595781 in do_begin /home/mdowle/build/R-devel/src/main/eval.c:2395
    #18 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #19 0x55c02c593daf in do_for /home/mdowle/build/R-devel/src/main/eval.c:2281
    #20 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #21 0x55c02c59ad25 in do_eval /home/mdowle/build/R-devel/src/main/eval.c:3181
    #22 0x55c02c6a3b76 in do_internal /home/mdowle/build/R-devel/src/main/names.c:1371
    #23 0x55c02c589202 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:699
    #24 0x55c02c58db20 in R_execClosure /home/mdowle/build/R-devel/src/main/eval.c:1764
    #25 0x55c02c58d006 in Rf_applyClosure /home/mdowle/build/R-devel/src/main/eval.c:1692
    #26 0x55c02c5b8354 in bcEval /home/mdowle/build/R-devel/src/main/eval.c:6739
    #27 0x55c02c588a6d in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:624
    #28 0x55c02c58db20 in R_execClosure /home/mdowle/build/R-devel/src/main/eval.c:1764
    #29 0x55c02c58d006 in Rf_applyClosure /home/mdowle/build/R-devel/src/main/eval.c:1692
    #30 0x55c02c5b8354 in bcEval /home/mdowle/build/R-devel/src/main/eval.c:6739
    #31 0x55c02c588a6d in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:624
    #32 0x55c02c58db20 in R_execClosure /home/mdowle/build/R-devel/src/main/eval.c:1764
    #33 0x55c02c58d006 in Rf_applyClosure /home/mdowle/build/R-devel/src/main/eval.c:1692
    #34 0x55c02c5898a3 in Rf_eval /home/mdowle/build/R-devel/src/main/eval.c:747
    #35 0x55c02c64305d in Rf_ReplIteration /home/mdowle/build/R-devel/src/main/main.c:258
    #36 0x55c02c6434ba in R_ReplConsole /home/mdowle/build/R-devel/src/main/main.c:308
    #37 0x55c02c64727e in run_Rmainloop /home/mdowle/build/R-devel/src/main/main.c:1082
    #38 0x55c02c647294 in Rf_mainloop /home/mdowle/build/R-devel/src/main/main.c:1089
    #39 0x55c02c37e322 in main /home/mdowle/build/R-devel/src/main/Rmain.c:29
    #40 0x7f3e4d1531c0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x211c0)
    #41 0x55c02c37e1d9 in _start (/home/mdowle/build/R-devel/bin/exec/R+0x6061d9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/Rtmp8649wu/R.INSTALL6cedcd51d6d/data.table/src/ijoin.c:157 in lookup
==28056==ABORTING
mattdowle commented 6 years ago

win-builder passing OK again. Three repeated submissions all passed. Let's put that down to the (now fixed) ijoin.c segv then.

I never saw the memory exhausted (limit reached?) on win-builder, only on AppVeyor. This issue #2767 remains open then and seems AppVeyor-only. Since win-builder is now consistently passing OK, that's enough to go to CRAN.

mattdowle commented 6 years ago

I may be imagining it, but AppVeyor 32bit seems more stable now after 3-4 runs. It was still failing with R's memory exhausted (limit reached?) error after the ijoin.c segv fix (so that didn't fix it). What's changed since then is I added back these lines to appveyor.yml :

# Include the subset of Suggests that test.data.table() has tests for :
- travis-tool.sh r_install bit64
- travis-tool.sh r_install xts
- travis-tool.sh r_install nanotime
- travis-tool.sh r_install chron

So that no tests are skipped by AppVeyor. It's possible that there's a sample() (or similar) in a test somewhere that doesn't have a set.seed(). Such test's data would depend on the sampling that has happened in the test suite up to that point, since the last set.seed(). When these extra tests now run (bit64, xts, etc) perhaps they do some sampling and change the data in the pesky test afterwards, by their existence. Perhaps the pesky test is taking a different branch now on the different data and not hitting the pesky line anymore. So, to reproduce, removing these 4 packages locally and rerunning extra-strong R-devel --strict-barrier ASAN, together with valgrid instrumentation, might reveal the memory fault. Since it's still feasible that the memory exhausted is an artifact of a memory fault. There are still some uninitialized variables to clear up from valgrind, too. And there's full 100% coverage yet to achieve, too.

mattdowle commented 6 years ago

I removed those packages locally (bit64, xts, nanotime and chron) and reran test.data.table() in R compiled with ASAN and strict-barrier, in the hopes of seeing some hint of problem. But nothing. Works fine here. I also ran 32bit R-devel (2018-04-28 r74669) on Windows without those 4 packages installed. Just like AppVeyor was doing. Also fine. No sign of either crash or memory fault. Task manager doesn't show anything close to R using 2GB; just 200MB, consistent with other results. Have to do valgrind and new problem in R-devel anyway (#2820). Maybe it was to do with that.

mattdowle commented 6 years ago

One final attempt. On Windows 32bit R-devel I did the following :

for (i in 1:10) if (!test.data.table()) stop("one or more tests failed")

and then after a few iterations I saw it!!

Running test id 1253.466    Error: memory exhausted (limit reached?)
Error: memory exhausted (limit reached?)
Error: memory exhausted (limit reached?)

And indeed the R process (32bit) is close to 4GB. image

which is way above what we have observed and were expecting since we know the test suite doesn't use that much.

Does anyone else see it on Windows? Need to repeat 10 times like that it seems.

Maybe if I repeat 10 times locally it'll hit that locally then I can trace with linux tools. Will try ...

Feels to me like a true data.table bug that is triggered by random data in a test causing a pesky branch to be hit. Not actually 32-bit or Windows related. Just showing up there, so far, by chance.

This could be the same as #2822, albeit a crash there.

mattdowle commented 6 years ago

The latest 'memory exhausted' fail occurred on R-devel only, as before. But both 32bit and 64bit this time so it's not 32bit-only. Also, the suggests were loaded this time and their tests ran. So it's not to do with whether bit64, xts or nanotime are installed or not, either.

Link to log : https://ci.appveyor.com/project/Rdatatable/data-table/build/1.0.2472/job/17yi0fe0562dht8v

Copy and paste of AppVeyor log ``` Build started $ErrorActionPreference = "Stop" Invoke-WebRequest http://raw.github.com/krlmlr/r-appveyor/master/scripts/appveyor-tool.ps1 -OutFile "..\appveyor-tool.ps1" Import-Module '..\appveyor-tool.ps1' git clone -q https://github.com/Rdatatable/data.table.git C:\projects\data-table git fetch -q origin +refs/pull/2832/merge: git checkout -qf FETCH_HEAD Running Install scripts Bootstrap == 05/04/2018 02:22:51: Bootstrap: Start == 05/04/2018 02:22:51: Adding GnuWin32 tools to PATH == 05/04/2018 02:22:51: Setting time zone UTC GMT Standard Time == 05/04/2018 02:22:51: Version: devel == 05/04/2018 02:22:51: URL path: == 05/04/2018 02:22:51: Downloading R from: https://cloud.r-project.org/bin/windows/base/R-devel-win.exe == 05/04/2018 02:22:54: Running R installer R is now available on drive C: == 05/04/2018 02:23:16: Setting PATH == 05/04/2018 02:23:16: Testing R installation R Under development (unstable) (2018-05-02 r74682) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows Server 2012 R2 x64 (build 9600) Matrix products: default locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_3.6.0 == 05/04/2018 02:23:17: Determining Rtools version == 05/04/2018 02:23:17: Downloading Rtools from: https://cloud.r-project.org/bin/windows/Rtools/Rtools35.exe == 05/04/2018 02:23:20: Running Rtools installer Rtools is now available on drive C: == 05/04/2018 02:24:04: Setting PATH == 05/04/2018 02:24:04: Downloading and installing travis-tool.sh @bash.exe ../travis-tool.sh %* ^\.travis\.yml$ ^CONTRIBUTING\.md$ ^ISSUE_TEMPLATE\.md$ ^CRAN_Release\.cmd$ ^deploy\.sh$ ^\.emacs\.desktop ^\.emacs\.desktop\.lock ^.*\.Rproj$ ^\.Rproj\.user$ ^appveyor\.yml$ ^.gitlab-ci\.yml$ ^\.idea$ ^\.RData$ ^\.Rhistory$ ^data\.table_.*\.tar\.gz$ ^cc\.R$ ^Makefile$ ^NEWS.0.md$ ^travis-tool\.sh\.cmd$ C:\Rtools\bin C:\Rtools\MinGW\bin C:\Rtools\mingw_64\bin C:\R\bin\x64 C:\Rtools\mingw_64\bin C:\MinGW\msys\1.0\bin C:\Program Files (x86)\Git\bin C:\Perl\site\bin C:\Perl\bin C:\Windows\system32 C:\Windows C:\Windows\System32\Wbem C:\Windows\System32\WindowsPowerShell\v1.0\ C:\Program Files\7-Zip C:\Program Files\Microsoft\Web Platform Installer\ C:\Tools\GitVersion C:\Tools\PsTools C:\Program Files\Git LFS C:\Program Files (x86)\Subversion\bin C:\Program Files\Microsoft SQL Server\120\Tools\Binn\ C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\ C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ C:\Program Files\Microsoft SQL Server\120\DTS\Binn\ C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\ C:\Tools\WebDriver C:\Program Files (x86)\Microsoft SDKs\TypeScript\1.4\ C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\PrivateAssemblies\ C:\Program Files (x86)\Microsoft SDKs\Azure\CLI\wbin C:\Ruby193\bin C:\Tools\NUnit\bin C:\Tools\xUnit C:\Tools\MSpec C:\Tools\Coverity\bin C:\Program Files (x86)\CMake\bin C:\go\bin C:\Program Files\Java\jdk1.8.0\bin C:\Python27 C:\Program Files\nodejs C:\Program Files (x86)\iojs C:\Program Files\iojs C:\Users\appveyor\AppData\Roaming\npm C:\Program Files\Microsoft SQL Server\130\Tools\Binn\ C:\Program Files (x86)\MSBuild\14.0\Bin C:\Tools\NuGet C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\CommonExtensions\Microsoft\TestWindow C:\Program Files\Microsoft DNX\Dnvm C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\ C:\Program Files (x86)\Microsoft SQL Server\130\Tools\Binn\ C:\Program Files (x86)\Microsoft SQL Server\130\DTS\Binn\ C:\Program Files\Microsoft SQL Server\130\DTS\Binn\ C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\ C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\ C:\Program Files (x86)\Apache\Maven\bin C:\Python27\Scripts C:\Tools\NUnit3 C:\Program Files\Mercurial\ C:\Program Files\LLVM\bin C:\Program Files\dotnet\ C:\Tools\curl\bin C:\Program Files\Amazon\AWSCLI\ C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\ C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\140 C:\Tools\vcpkg C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ C:\Program Files\Microsoft SQL Server\140\Tools\Binn\ C:\Program Files\Microsoft SQL Server\140\DTS\Binn\ C:\Program Files\PowerShell\6.0.0\ C:\Program Files\erl9.2\bin C:\Program Files (x86)\nodejs\ C:\Program Files\Git\cmd C:\Program Files\Git\usr\bin C:\Program Files (x86)\Yarn\bin\ C:\Program Files (x86)\NSIS C:\Tools\Octopus C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code C:\Program Files\Microsoft SDKs\Service Fabric\Tools\ServiceFabricLocalClusterManager C:\ProgramData\chocolatey\bin C:\Users\appveyor\AppData\Roaming\npm C:\Users\appveyor\AppData\Local\Yarn\bin C:\Program Files\AppVeyor\BuildAgent\ == 05/04/2018 02:24:04: Setting R_LIBS_USER Directory: C:\ Mode LastWriteTime Length Name ---- ------------- ------ ---- d----- 5/4/2018 2:24 AM RLibrary == 05/04/2018 02:24:04: Setting TAR to 'internal' == 05/04/2018 02:24:04: Bootstrap: Done ECHO no Revision metadata added to DESCRIPTION no Revision metadata added to DESCRIPTION set _R_CHECK_FORCE_SUGGESTS_=false travis-tool.sh r_install bit64 + CRAN=http://cloud.r-project.org/ + BIOC=http://bioconductor.org/biocLite.R + PKGTYPE=both + BIOC_USE_DEVEL=TRUE ++ uname -s + OS=MSYS_NT-6.3-WOW + PANDOC_VERSION=1.13.1 + PANDOC_DIR=/opt/pandoc + PANDOC_URL=https://s3.amazonaws.com/rstudio-buildtools/pandoc-1.13.1.zip + PATH='/c/Rtools/bin:/c/Rtools/MinGW/bin:/c/Rtools/mingw_64/bin:/c/R/bin/x64:/c/Rtools/mingw_64/bin:/usr/bin:/c/Program Files (x86)/Git/bin:/c/Perl/site/bin:/c/Perl/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/c/Program Files/7-Zip:/c/Program Files/Microsoft/Web Platform Installer/:/c/Tools/GitVersion:/c/Tools/PsTools:/c/Program Files/Git LFS:/c/Program Files (x86)/Subversion/bin:/c/Program Files/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/110/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/ManagementStudio/:/c/Tools/WebDriver:/c/Program Files (x86)/Microsoft SDKs/TypeScript/1.4/:/c/Program Files (x86)/Microsoft Visual Studio 12.0/Common7/IDE/PrivateAssemblies/:/c/Program Files (x86)/Microsoft SDKs/Azure/CLI/wbin:/c/Ruby193/bin:/c/Tools/NUnit/bin:/c/Tools/xUnit:/c/Tools/MSpec:/c/Tools/Coverity/bin:/c/Program Files (x86)/CMake/bin:/c/go/bin:/c/Program Files/Java/jdk1.8.0/bin:/c/Python27:/c/Program Files/nodejs:/c/Program Files (x86)/iojs:/c/Program Files/iojs:/c/Users/appveyor/AppData/Roaming/npm:/c/Program Files/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/MSBuild/14.0/Bin:/c/Tools/NuGet:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/CommonExtensions/Microsoft/TestWindow:/c/Program Files/Microsoft DNX/Dnvm:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/110/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Apache/Maven/bin:/c/Python27/Scripts:/c/Tools/NUnit3:/c/Program Files/Mercurial/:/c/Program Files/LLVM/bin:/c/Program Files/dotnet/:/c/Tools/curl/bin:/c/Program Files/Amazon/AWSCLI/:/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/Extensions/Microsoft/SQLDB/DAC/140:/c/Tools/vcpkg:/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files/PowerShell/6.0.0/:/c/Program Files/erl9.2/bin:/c/Program Files (x86)/nodejs/:/c/Program Files/Git/cmd:/c/Program Files/Git/usr/bin:/c/Program Files (x86)/Yarn/bin/:/c/Program Files (x86)/NSIS:/c/Tools/Octopus:/c/Program Files/Microsoft Service Fabric/bin/Fabric/Fabric.Code:/c/Program Files/Microsoft SDKs/Service Fabric/Tools/ServiceFabricLocalClusterManager:/c/ProgramData/chocolatey/bin:/c/Users/appveyor/AppData/Roaming/npm:/c/Users/appveyor/AppData/Local/Yarn/bin:/c/Program Files/AppVeyor/BuildAgent/:/usr/texbin' + R_BUILD_ARGS=--no-manual + R_CHECK_ARGS=--no-manual + R_USE_BIOC_CMDS='source('\''http://bioconductor.org/biocLite.R'\''); tryCatch(useDevel(TRUE), error=function(e) {if (!grepl('\''already in use'\'', e)) {e}}); options(repos=biocinstallRepos());' + COMMAND=r_install + echo 'Running command: r_install' Running command: r_install + shift + case $COMMAND in + RInstall bit64 + [[ '' == \b\i\t\6\4 ]] + echo 'Installing R package(s): bit64' Installing R package(s): bit64 + Rscript -e 'install.packages(commandArgs(TRUE), repos="http://cloud.r-project.org/", INSTALL_opts="", type="both")' bit64 Installing package into 'c:/RLibrary' (as 'lib' is unspecified) also installing the dependency 'bit' trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/bit_1.1-12.zip' Content type 'application/zip' length 240057 bytes (234 KB) ================================================== downloaded 234 KB trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/bit64_0.9-7.zip' Content type 'application/zip' length 546956 bytes (534 KB) ================================================== downloaded 534 KB package 'bit' successfully unpacked and MD5 sums checked package 'bit64' successfully unpacked and MD5 sums checked The downloaded binary packages are in C:\Users\appveyor\AppData\Local\Temp\1\RtmpoX0Fgh\downloaded_packages travis-tool.sh r_install xts + CRAN=http://cloud.r-project.org/ + BIOC=http://bioconductor.org/biocLite.R + PKGTYPE=both + BIOC_USE_DEVEL=TRUE ++ uname -s + OS=MSYS_NT-6.3-WOW + PANDOC_VERSION=1.13.1 + PANDOC_DIR=/opt/pandoc + PANDOC_URL=https://s3.amazonaws.com/rstudio-buildtools/pandoc-1.13.1.zip + PATH='/c/Rtools/bin:/c/Rtools/MinGW/bin:/c/Rtools/mingw_64/bin:/c/R/bin/x64:/c/Rtools/mingw_64/bin:/usr/bin:/c/Program Files (x86)/Git/bin:/c/Perl/site/bin:/c/Perl/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/c/Program Files/7-Zip:/c/Program Files/Microsoft/Web Platform Installer/:/c/Tools/GitVersion:/c/Tools/PsTools:/c/Program Files/Git LFS:/c/Program Files (x86)/Subversion/bin:/c/Program Files/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/110/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/ManagementStudio/:/c/Tools/WebDriver:/c/Program Files (x86)/Microsoft SDKs/TypeScript/1.4/:/c/Program Files (x86)/Microsoft Visual Studio 12.0/Common7/IDE/PrivateAssemblies/:/c/Program Files (x86)/Microsoft SDKs/Azure/CLI/wbin:/c/Ruby193/bin:/c/Tools/NUnit/bin:/c/Tools/xUnit:/c/Tools/MSpec:/c/Tools/Coverity/bin:/c/Program Files (x86)/CMake/bin:/c/go/bin:/c/Program Files/Java/jdk1.8.0/bin:/c/Python27:/c/Program Files/nodejs:/c/Program Files (x86)/iojs:/c/Program Files/iojs:/c/Users/appveyor/AppData/Roaming/npm:/c/Program Files/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/MSBuild/14.0/Bin:/c/Tools/NuGet:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/CommonExtensions/Microsoft/TestWindow:/c/Program Files/Microsoft DNX/Dnvm:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/110/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Apache/Maven/bin:/c/Python27/Scripts:/c/Tools/NUnit3:/c/Program Files/Mercurial/:/c/Program Files/LLVM/bin:/c/Program Files/dotnet/:/c/Tools/curl/bin:/c/Program Files/Amazon/AWSCLI/:/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/Extensions/Microsoft/SQLDB/DAC/140:/c/Tools/vcpkg:/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files/PowerShell/6.0.0/:/c/Program Files/erl9.2/bin:/c/Program Files (x86)/nodejs/:/c/Program Files/Git/cmd:/c/Program Files/Git/usr/bin:/c/Program Files (x86)/Yarn/bin/:/c/Program Files (x86)/NSIS:/c/Tools/Octopus:/c/Program Files/Microsoft Service Fabric/bin/Fabric/Fabric.Code:/c/Program Files/Microsoft SDKs/Service Fabric/Tools/ServiceFabricLocalClusterManager:/c/ProgramData/chocolatey/bin:/c/Users/appveyor/AppData/Roaming/npm:/c/Users/appveyor/AppData/Local/Yarn/bin:/c/Program Files/AppVeyor/BuildAgent/:/usr/texbin' + R_BUILD_ARGS=--no-manual + R_CHECK_ARGS=--no-manual + R_USE_BIOC_CMDS='source('\''http://bioconductor.org/biocLite.R'\''); tryCatch(useDevel(TRUE), error=function(e) {if (!grepl('\''already in use'\'', e)) {e}}); options(repos=biocinstallRepos());' + COMMAND=r_install + echo 'Running command: r_install' Running command: r_install + shift + case $COMMAND in + RInstall xts + [[ '' == \x\t\s ]] + echo 'Installing R package(s): xts' Installing R package(s): xts + Rscript -e 'install.packages(commandArgs(TRUE), repos="http://cloud.r-project.org/", INSTALL_opts="", type="both")' xts Installing package into 'c:/RLibrary' (as 'lib' is unspecified) also installing the dependency 'zoo' trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/zoo_1.8-1.zip' Content type 'application/zip' length 1088462 bytes (1.0 MB) ================================================== downloaded 1.0 MB trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/xts_0.10-2.zip' Content type 'application/zip' length 944787 bytes (922 KB) ================================================== downloaded 922 KB package 'zoo' successfully unpacked and MD5 sums checked package 'xts' successfully unpacked and MD5 sums checked The downloaded binary packages are in C:\Users\appveyor\AppData\Local\Temp\1\RtmpYhGqGS\downloaded_packages travis-tool.sh r_install nanotime + CRAN=http://cloud.r-project.org/ + BIOC=http://bioconductor.org/biocLite.R + PKGTYPE=both + BIOC_USE_DEVEL=TRUE ++ uname -s + OS=MSYS_NT-6.3-WOW + PANDOC_VERSION=1.13.1 + PANDOC_DIR=/opt/pandoc + PANDOC_URL=https://s3.amazonaws.com/rstudio-buildtools/pandoc-1.13.1.zip + PATH='/c/Rtools/bin:/c/Rtools/MinGW/bin:/c/Rtools/mingw_64/bin:/c/R/bin/x64:/c/Rtools/mingw_64/bin:/usr/bin:/c/Program Files (x86)/Git/bin:/c/Perl/site/bin:/c/Perl/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/c/Program Files/7-Zip:/c/Program Files/Microsoft/Web Platform Installer/:/c/Tools/GitVersion:/c/Tools/PsTools:/c/Program Files/Git LFS:/c/Program Files (x86)/Subversion/bin:/c/Program Files/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/110/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/ManagementStudio/:/c/Tools/WebDriver:/c/Program Files (x86)/Microsoft SDKs/TypeScript/1.4/:/c/Program Files (x86)/Microsoft Visual Studio 12.0/Common7/IDE/PrivateAssemblies/:/c/Program Files (x86)/Microsoft SDKs/Azure/CLI/wbin:/c/Ruby193/bin:/c/Tools/NUnit/bin:/c/Tools/xUnit:/c/Tools/MSpec:/c/Tools/Coverity/bin:/c/Program Files (x86)/CMake/bin:/c/go/bin:/c/Program Files/Java/jdk1.8.0/bin:/c/Python27:/c/Program Files/nodejs:/c/Program Files (x86)/iojs:/c/Program Files/iojs:/c/Users/appveyor/AppData/Roaming/npm:/c/Program Files/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/MSBuild/14.0/Bin:/c/Tools/NuGet:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/CommonExtensions/Microsoft/TestWindow:/c/Program Files/Microsoft DNX/Dnvm:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/110/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Apache/Maven/bin:/c/Python27/Scripts:/c/Tools/NUnit3:/c/Program Files/Mercurial/:/c/Program Files/LLVM/bin:/c/Program Files/dotnet/:/c/Tools/curl/bin:/c/Program Files/Amazon/AWSCLI/:/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/Extensions/Microsoft/SQLDB/DAC/140:/c/Tools/vcpkg:/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files/PowerShell/6.0.0/:/c/Program Files/erl9.2/bin:/c/Program Files (x86)/nodejs/:/c/Program Files/Git/cmd:/c/Program Files/Git/usr/bin:/c/Program Files (x86)/Yarn/bin/:/c/Program Files (x86)/NSIS:/c/Tools/Octopus:/c/Program Files/Microsoft Service Fabric/bin/Fabric/Fabric.Code:/c/Program Files/Microsoft SDKs/Service Fabric/Tools/ServiceFabricLocalClusterManager:/c/ProgramData/chocolatey/bin:/c/Users/appveyor/AppData/Roaming/npm:/c/Users/appveyor/AppData/Local/Yarn/bin:/c/Program Files/AppVeyor/BuildAgent/:/usr/texbin' + R_BUILD_ARGS=--no-manual + R_CHECK_ARGS=--no-manual + R_USE_BIOC_CMDS='source('\''http://bioconductor.org/biocLite.R'\''); tryCatch(useDevel(TRUE), error=function(e) {if (!grepl('\''already in use'\'', e)) {e}}); options(repos=biocinstallRepos());' + COMMAND=r_install + echo 'Running command: r_install' Running command: r_install + shift + case $COMMAND in + RInstall nanotime + [[ '' == \n\a\n\o\t\i\m\e ]] + echo 'Installing R package(s): nanotime' Installing R package(s): nanotime + Rscript -e 'install.packages(commandArgs(TRUE), repos="http://cloud.r-project.org/", INSTALL_opts="", type="both")' nanotime Installing package into 'c:/RLibrary' (as 'lib' is unspecified) also installing the dependencies 'Rcpp', 'RcppCCTZ' trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/Rcpp_0.12.16.zip' Content type 'application/zip' length 4504406 bytes (4.3 MB) ================================================== downloaded 4.3 MB trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/RcppCCTZ_0.2.3.zip' Content type 'application/zip' length 737064 bytes (719 KB) ================================================== downloaded 719 KB trying URL 'http://cloud.r-project.org/bin/windows/contrib/3.6/nanotime_0.2.0.zip' Content type 'application/zip' length 114037 bytes (111 KB) ================================================== downloaded 111 KB package 'Rcpp' successfully unpacked and MD5 sums checked package 'RcppCCTZ' successfully unpacked and MD5 sums checked package 'nanotime' successfully unpacked and MD5 sums checked The downloaded binary packages are in C:\Users\appveyor\AppData\Local\Temp\1\Rtmpcd9ieQ\downloaded_packages travis-tool.sh run_tests + CRAN=http://cloud.r-project.org/ + BIOC=http://bioconductor.org/biocLite.R + PKGTYPE=both + BIOC_USE_DEVEL=TRUE ++ uname -s + OS=MSYS_NT-6.3-WOW + PANDOC_VERSION=1.13.1 + PANDOC_DIR=/opt/pandoc + PANDOC_URL=https://s3.amazonaws.com/rstudio-buildtools/pandoc-1.13.1.zip + PATH='/c/Rtools/bin:/c/Rtools/MinGW/bin:/c/Rtools/mingw_64/bin:/c/R/bin/x64:/c/Rtools/mingw_64/bin:/usr/bin:/c/Program Files (x86)/Git/bin:/c/Perl/site/bin:/c/Perl/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/c/Program Files/7-Zip:/c/Program Files/Microsoft/Web Platform Installer/:/c/Tools/GitVersion:/c/Tools/PsTools:/c/Program Files/Git LFS:/c/Program Files (x86)/Subversion/bin:/c/Program Files/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/110/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/:/c/Program Files/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/Tools/Binn/ManagementStudio/:/c/Tools/WebDriver:/c/Program Files (x86)/Microsoft SDKs/TypeScript/1.4/:/c/Program Files (x86)/Microsoft Visual Studio 12.0/Common7/IDE/PrivateAssemblies/:/c/Program Files (x86)/Microsoft SDKs/Azure/CLI/wbin:/c/Ruby193/bin:/c/Tools/NUnit/bin:/c/Tools/xUnit:/c/Tools/MSpec:/c/Tools/Coverity/bin:/c/Program Files (x86)/CMake/bin:/c/go/bin:/c/Program Files/Java/jdk1.8.0/bin:/c/Python27:/c/Program Files/nodejs:/c/Program Files (x86)/iojs:/c/Program Files/iojs:/c/Users/appveyor/AppData/Roaming/npm:/c/Program Files/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/MSBuild/14.0/Bin:/c/Tools/NuGet:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/CommonExtensions/Microsoft/TestWindow:/c/Program Files/Microsoft DNX/Dnvm:/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/Tools/Binn/:/c/Program Files (x86)/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files/Microsoft SQL Server/130/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/110/DTS/Binn/:/c/Program Files (x86)/Microsoft SQL Server/120/DTS/Binn/:/c/Program Files (x86)/Apache/Maven/bin:/c/Python27/Scripts:/c/Tools/NUnit3:/c/Program Files/Mercurial/:/c/Program Files/LLVM/bin:/c/Program Files/dotnet/:/c/Tools/curl/bin:/c/Program Files/Amazon/AWSCLI/:/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/Extensions/Microsoft/SQLDB/DAC/140:/c/Tools/vcpkg:/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/Tools/Binn/:/c/Program Files/Microsoft SQL Server/140/DTS/Binn/:/c/Program Files/PowerShell/6.0.0/:/c/Program Files/erl9.2/bin:/c/Program Files (x86)/nodejs/:/c/Program Files/Git/cmd:/c/Program Files/Git/usr/bin:/c/Program Files (x86)/Yarn/bin/:/c/Program Files (x86)/NSIS:/c/Tools/Octopus:/c/Program Files/Microsoft Service Fabric/bin/Fabric/Fabric.Code:/c/Program Files/Microsoft SDKs/Service Fabric/Tools/ServiceFabricLocalClusterManager:/c/ProgramData/chocolatey/bin:/c/Users/appveyor/AppData/Roaming/npm:/c/Users/appveyor/AppData/Local/Yarn/bin:/c/Program Files/AppVeyor/BuildAgent/:/usr/texbin' + R_BUILD_ARGS=--no-manual + R_CHECK_ARGS=--no-manual + R_USE_BIOC_CMDS='source('\''http://bioconductor.org/biocLite.R'\''); tryCatch(useDevel(TRUE), error=function(e) {if (!grepl('\''already in use'\'', e)) {e}}); options(repos=biocinstallRepos());' + COMMAND=run_tests + echo 'Running command: run_tests' Running command: run_tests + shift + case $COMMAND in + RunTests + echo 'Building with: R CMD build --no-manual' Building with: R CMD build --no-manual + [[ MSYS_ == \M\I\N\G\W ]] + [[ MSYS == \M\S\Y\S ]] + [[ -d vignettes ]] + rm -rf vignettes + Rscript -e 'd <- read.dcf('\''DESCRIPTION'\''); d[, colnames(d) == '\''VignetteBuilder'\''] <- NA; write.dcf(d, '\''DESCRIPTION'\'')' + R CMD build --no-manual . * checking for file './DESCRIPTION' ... OK * preparing 'data.table': * checking DESCRIPTION meta-information ... OK * cleaning src * checking for LF line-endings in source and make files and shell scripts * checking for empty or unneeded directories * building 'data.table_1.11.1.tar.gz' ++ ls -1t data.table_1.11.1.tar.gz ++ head -n 1 + FILE=data.table_1.11.1.tar.gz + [[ MSYS_ == \M\I\N\G\W ]] + [[ MSYS == \M\S\Y\S ]] + R_CHECK_INSTALL_ARGS=--install-args=--build + echo 'Testing with: R CMD check "data.table_1.11.1.tar.gz" --no-manual --install-args=--build' Testing with: R CMD check "data.table_1.11.1.tar.gz" --no-manual --install-args=--build + _R_CHECK_CRAN_INCOMING_=FALSE + [[ FALSE == \F\A\L\S\E ]] + echo '(CRAN incoming checks are off)' (CRAN incoming checks are off) + _R_CHECK_CRAN_INCOMING_=FALSE + R_QPDF=true + R CMD check data.table_1.11.1.tar.gz --no-manual --install-args=--build * using log directory 'C:/projects/data-table/data.table.Rcheck' * using R Under development (unstable) (2018-05-02 r74682) * using platform: x86_64-w64-mingw32 (64-bit) * using session charset: ISO8859-1 * using option '--no-manual' * checking for file 'data.table/DESCRIPTION' ... OK * this is package 'data.table' version '1.11.1' * checking package namespace information ... OK * checking package dependencies ... NOTE Packages suggested but not available for checking: 'curl' 'knitr' 'reshape2' 'testthat' * checking if this is a source package ... OK * checking if there is a namespace ... OK * checking for executable files ... OK * checking for hidden files and directories ... OK * checking for portable file names ... OK * checking whether package 'data.table' can be installed ... OK * checking installed package size ... OK * checking package directory ... OK * checking DESCRIPTION meta-information ... OK * checking top-level files ... OK * checking for left-over files ... OK * checking index information ... OK * checking package subdirectories ... OK * checking R files for non-ASCII characters ... OK * checking R files for syntax errors ... OK * loading checks for arch 'i386' ** checking whether the package can be loaded ... OK ** checking whether the package can be loaded with stated dependencies ... OK ** checking whether the package can be unloaded cleanly ... OK ** checking whether the namespace can be loaded with stated dependencies ... OK ** checking whether the namespace can be unloaded cleanly ... OK ** checking loading without being on the library search path ... OK * loading checks for arch 'x64' ** checking whether the package can be loaded ... OK ** checking whether the package can be loaded with stated dependencies ... OK ** checking whether the package can be unloaded cleanly ... OK ** checking whether the namespace can be loaded with stated dependencies ... OK ** checking whether the namespace can be unloaded cleanly ... OK ** checking loading without being on the library search path ... OK * checking dependencies in R code ... OK * checking S3 generic/method consistency ... OK * checking replacement functions ... OK * checking foreign function calls ... OK * checking R code for possible problems ... OK * checking Rd files ... OK * checking Rd metadata ... OK * checking Rd cross-references ... NOTE Package unavailable to check Rd xrefs: 'fastmatch' * checking for missing documentation entries ... OK * checking for code/documentation mismatches ... OK * checking Rd \usage sections ... OK * checking Rd contents ... OK * checking for unstated dependencies in examples ... OK * checking line endings in C/C++/Fortran sources/headers ... OK * checking line endings in Makefiles ... OK * checking compilation flags in Makevars ... OK * checking for GNU extensions in Makefiles ... OK * checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK * checking compiled code ... OK * checking examples ... ** running examples for arch 'i386' ... OK ** running examples for arch 'x64' ... OK * checking for unstated dependencies in 'tests' ... OK * checking tests ... ** running tests for arch 'i386' ... Running 'autoprint.R' Comparing 'autoprint.Rout' to 'autoprint.Rout.save' ... OK Running 'knitr.R' Comparing 'knitr.Rout' to 'knitr.Rout.save' ... OK Running 'main.R' ERROR Running the tests in 'tests/main.R' failed. Last 13 lines of output: Running test id 1253.195 Running test id 1253.196 Running test id 1253.197 Running test id 1253.198 Running test id 1253.199 Running test id 1253.2 Running test id 1253.201 Running test id 1253.202 Running test id 1253.203 Running test id 1253.204 Running test id 1253.205 Running test id 1253.206 Running test id 1253.207 Error: memory exhausted (limit reached?) Error: memory exhausted (limit reached?) Execution halted ** running tests for arch 'x64' ... Running 'autoprint.R' Comparing 'autoprint.Rout' to 'autoprint.Rout.save' ... OK Running 'knitr.R' Comparing 'knitr.Rout' to 'knitr.Rout.save' ... OK Running 'main.R' ERROR Running the tests in 'tests/main.R' failed. Last 13 lines of output: Running test id 1613.5531 Running test id 1613.5532 Running test id 1613.561 Running test id 1613.562 Running test id 1613.563 Running test id 1613.564 Running test id 1613.565 Running test id 1613.566 Running test id 1613.567 Running test id 1613.571 Running test id 1613.572 Running test id 1613.573 Running test id 1613.581 Running test id 1613.582 Error: memory exhausted (limit reached?) Execution halted * DONE Status: 2 ERRORs, 2 NOTEs See 'C:/projects/data-table/data.table.Rcheck/00check.log' for details. Command exited with code 1 7z a failure.zip *.Rcheck\* 7-Zip [64] 16.04 : Copyright (c) 1999-2016 Igor Pavlov : 2016-10-04 Scanning the drive: 26 folders, 462 files, 11121162 bytes (11 MiB) Creating archive: failure.zip Items to compress: 488 Files read from disk: 462 Archive size: 4104726 bytes (4009 KiB) Everything is Ok appveyor PushArtifact failure.zip Uploading artifact failure.zip (4,104,726 bytes)...100% ```
adamaltmejd commented 6 years ago

Dont't know if related but I just ran into this fail during the join below that has been running fine before. The dt object is around 4gb in memory but when doing this join R memory usage in Activity Monitor increases to 16... my sessioninfo below.

dt[
    degrees[
        dt,
        max.int(SunNiva),
        on = .(id, KalenderAr < admission_year),
        by = .EACHI
    ],
    tmp.SunNiva_achieved_degree := i.V1,
    on = .(id, admission_year = KalenderAr)
]
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin17.5.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Matrix products: default
BLAS/LAPACK: /usr/local/Cellar/openblas/0.2.20_2/lib/libopenblas_haswellp-r0.2.20.dylib

data table: data.table 1.11.1 IN DEVELOPMENT built 2018-05-05
mattdowle commented 6 years ago

@adamaltmejd Thanks for this -- great. Could you try again with master as of now please; a few memory faults fixed in the last few days. If it still fails, is there any way to generate some dummy data for us so we've got something to run?

adamaltmejd commented 6 years ago

Just tried reinstalling with install_github("Rdatatable/data.table", build_vignettes=FALSE), same error.

Before doing that however I noticed that also my backup script that was using dplyr to get around some other data.table issues (fread and NUL bytes) also ran into the same memory error on a dplyr operation. Maybe its actually something with my data that has changed that I can't pinpoint, or with R 3.5.0 i guess. Or its related to the data table object already in memory..

Kind of tricky to reproduce, data is confidential. But will try as soon as I have some more time.

jangorecki commented 6 years ago

@mattdowle I run memtest for windows and got the following (windows uses only gc() measures, not ps). At least something that explains hitting 2GB limit, and it also confirms the range of tests when issue starting to occur. You can reproduce yourself on w1.

# x64 win r-devel
data.table::fread("wget -q https://github.com/Rdatatable/data.table/files/1995905/memtest.csv.zip && unzip -cq memtest.csv.zip"
  )[, last_GC_used := data.table::shift(GC_used)
    ][, GC_used_diff := GC_used-last_GC_used
      ][order(-GC_used_diff), head(.SD, 10), .SDcols=c("test", "GC_used", "GC_used_diff")]
#        test GC_used GC_used_diff
#       <num>   <num>        <num>
# 1: 1848.000  7875.9        441.8
# 2: 1835.000  7405.1        316.2
# 3: 1739.400  6581.1         58.6
# 4: 1739.100  6519.8         49.6
# 5: 1158.000   100.3         48.5
# 6: 1750.800  6668.4         37.4
# 7: 1549.000  2670.7         35.4
# 8: 1542.000  2597.2         35.1
# 9: 1642.001  3661.0         33.8
#10: 1779.800  6960.8         24.2

So 1848 and 1835 seems to be contributing the most to GC_used measure. Smallest test number, so potentially first one (in this top 10) is 1158 which is causing problems described in #2854.

@HughParsonage are you able to run memtest on your R-devel to confirm below numbers? just set env var TEST_DATA_TABLE_MEMTEST=TRUE and call test.data.table. Be aware single run took ~50 minutes. Ignore ps -o warnings in output (I hope they won't stop tests, I might have fixed that already in my dev branch).

Windows R Under development (unstable) (2018-05-05 r74699)

i386

memtest.csv.zip memtest

x64

memtest.csv.zip memtest

Windows R version 3.5.0 (2018-04-23)

i386

memtest.csv.zip memtest

x64

memtest.csv.zip memtest

Linux R version 3.4.4 (2018-03-15)

memtest.csv.zip memtest

HughParsonage commented 6 years ago

R dev r74708 Windows 64 bit

data.table 1.11.3 2018-05-11 01:26:09 UTC

memtest.csv.txt

jangorecki commented 6 years ago

It seems to be very fast on your machine, but also confirms memory used.

tail(data.table::fread("https://github.com/Rdatatable/data.table/files/1996044/memtest.csv.txt"), 3)
#     inittime   filename  timestamp   test PS_rss GC_used GC_gc_trigger   GC_max_used
#1: 1526054564 tests.Rraw 1526054785 1894.2     NA  8129.3        8185.1         179.2
#2: 1526054564 tests.Rraw 1526054785 1894.3     NA  8130.1        8185.9         179.2
#3: 1526054564 tests.Rraw 1526054785 1894.4     NA  8130.9        8186.7         179.2
HughParsonage commented 6 years ago

Have I done something wrong? It's a 5 year old laptop that cost about $500...

jangorecki commented 6 years ago

Not sure. 50 min on my machine was actually R CMD check and installing some deps from source, not just test.data.table. Tests and examples for i386 and x64. How much RAM you have?

HughParsonage commented 6 years ago

8 GB

adamaltmejd commented 6 years ago

Just managed to run a version of the script referenced in my post above but with a larger data set on a proper server with 120GB RAM, and it worked, using 80Gb memory at peak. Using R 3.5.0 and CRAN version of datatable (server for confidential data so not allowed to install newer version).