vsespb / mt-aws-glacier

Perl Multithreaded Multipart sync to Amazon Glacier
http://mt-aws.com/
GNU General Public License v3.0

stop while uploading #105

Open pqkhanhvn opened 9 years ago

pqkhanhvn commented 9 years ago

I am using mt-aws-glacier version 1.120 to upload a 6 GB file to Glacier, but the upload process stops without any error message. Below are the command line and the log:

$mtglacier sync --config=glacier.cfg --dir /storage/DATA --vault=data --journal=journal.info --concurrency=4 --partsize=64

MT-AWS-Glacier, Copyright 2012-2014 Victor Efimov http://mt-aws.com/ Version 1.120

PID 7123 Started worker
PID 7124 Started worker
PID 7125 Started worker
PID 7126 Started worker
PID 7124 Created an upload_id StFe-J5vFcAsOYwO5FCPtqHJTyZGeJZHnRse5vgr2epw3iCMH--bDzRgbdIe5pwCW4S56QBzquResYHEoBvaXotjYVtS
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [0]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [201326592]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [134217728]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [67108864]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [335544320]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [469762048]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [268435456]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [402653184]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [738197504]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [536870912]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [603979776]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [671088640]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [872415232]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [805306368]
PID 7125 HTTP 408 This might be normal. Will retry (322 seconds spent for request)
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1073741824]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [939524096]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1207959552]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1275068416]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1140850688]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1006632960]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1342177280]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1409286144]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1476395008]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1543503872]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1677721600]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1744830464]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1610612736]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1811939328]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1879048192]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2080374784]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [1946157056]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2013265920]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2281701376]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2214592512]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2147483648]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2348810240]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2550136832]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2415919104]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2483027968]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2617245696]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2684354560]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2751463424]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2818572288]
PID 7125 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [3087007744]
PID 7124 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2885681152]
PID 7123 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [3019898880]
PID 7126 Uploaded part for DATA/20141225_14_1_DATA.tar.gz.gpg at offset [2952790016]

The upload process stops at the last log line without any error message. Could you please look into this problem? Thanks for the great tool!
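The offsets in the log are multiples of the 64 MiB part size (--partsize=64), so the log can be checked for parts that were never reported. The following is a minimal Python sketch and not part of mt-aws-glacier; the function names `uploaded_offsets` and `missing_offsets` are made up for illustration, and the regular expression assumes log lines of the form shown in this report.

```python
import re

PART_SIZE = 64 * 1024 * 1024  # --partsize=64 is given in MiB

# Assumed line format: "PID 7125 Uploaded part for <file> at offset [0]"
OFFSET_RE = re.compile(r"Uploaded part for .+? at offset \[(\d+)\]")


def uploaded_offsets(log_text):
    """Return the set of byte offsets reported as uploaded in the log."""
    return {int(m.group(1)) for m in OFFSET_RE.finditer(log_text)}


def missing_offsets(log_text, part_size=PART_SIZE):
    """Return offsets up to the highest logged one that never appeared."""
    seen = uploaded_offsets(log_text)
    if not seen:
        return []
    expected = range(0, max(seen) + part_size, part_size)
    return [off for off in expected if off not in seen]


if __name__ == "__main__":
    sample = (
        "PID 7125 Uploaded part for DATA/x.gpg at offset [0]\n"
        "PID 7123 Uploaded part for DATA/x.gpg at offset [134217728]\n"
    )
    print(sorted(uploaded_offsets(sample)))
    print(missing_offsets(sample))
```

An empty result from `missing_offsets` on a saved log would suggest that every part up to the highest offset was reported and the process simply ended before requesting more.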

vsespb commented 9 years ago

1) need strace -p $PID for each of last pids 7126,7123,7124 etc

2) is the problem repeatable ?

pqkhanhvn commented 9 years ago

1) This is the strace output for each PID:
root@backup:/home/glacier# strace -p 7126
attach: ptrace(PTRACE_ATTACH, ...): No such process
root@backup:/home/glacier# strace -p 7123
attach: ptrace(PTRACE_ATTACH, ...): No such process
root@backup:/home/glacier# strace -p 7124
attach: ptrace(PTRACE_ATTACH, ...): No such process
root@backup:/home/glacier# strace -p 7125
attach: ptrace(PTRACE_ATTACH, ...): No such process

2) Yes, the problem is repeatable. It stops randomly, not after a fixed period of time.

vsespb commented 9 years ago

I need
1) perl -MJSON::XS -E 'say JSON::XS->VERSION'
2) perl -MDigest::SHA -E 'say Digest::SHA->VERSION'
3) perl -V (note capital V)
4) your OS and distro version
5) run echo $? in same terminal, after you see failure again
6) check syslog for OOM errors (out of memory)
7) Is it possible that there is not enough memory during the run?
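Because `echo $?` only reports the status of the most recent command, it is easy to lose the value by typing anything else first. One way to capture it reliably is to launch the tool from a small wrapper. This is a hedged Python sketch, not something shipped with mt-aws-glacier; `run_and_report` is a hypothetical helper name, and the signal detection relies on POSIX behaviour of `subprocess`.

```python
import subprocess
import sys
import time


def run_and_report(argv):
    """Run argv and return (exit_code, signal_number, elapsed_seconds)."""
    started = time.monotonic()
    proc = subprocess.run(argv)
    elapsed = time.monotonic() - started
    # On POSIX, a negative returncode means the child was killed by a signal.
    if proc.returncode < 0:
        return None, -proc.returncode, elapsed
    return proc.returncode, None, elapsed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python wrapper.py mtglacier sync --config=glacier.cfg ...
    code, sig, elapsed = run_and_report(sys.argv[1:])
    print(f"exit code: {code}, signal: {sig}, after {elapsed:.0f} seconds")
```

A signal number of 9 (SIGKILL) would be consistent with the kernel OOM killer, while a plain exit code of 0 would point to the program ending on its own.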

pqkhanhvn commented 9 years ago

1) root@backup:/home/glacier# perl -MJSON::XS -E 'say JSON::XS->VERSION'
2.32
2) root@backup:/home/glacier# perl -MDigest::SHA -E 'say Digest::SHA->VERSION'
5.61
3) root@backup:/home/glacier# perl -V
Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

Platform: osname=linux, osvers=2.6.42-26-generic, archname=i686-linux-gnu-thread-multi-64int uname='linux roseapple 2.6.42-26-generic #41-ubuntu smp thu jun 14 17:49:24 utc 2012 i686 i686 i386 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i686-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.14 -Darchlib=/usr/lib/perl/5.14 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.14.2 -Dsitearch=/usr/local/lib/perl/5.14.2 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.14.2 -des' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=define, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2 -g', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.6.3', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread 
-lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=, so=so, useshrplib=true, libperl=libperl.so.5.14.2 gnulibc_version='2.15' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_PERL_ATOF USE_REENTRANT_API Locally applied patches: DEBPKG:debian/arm_thread_stress_timeout - http://bugs.debian.org/501970 Raise the timeout of ext/threads/shared/t/stress.t to accommodate slower build hosts DEBPKG:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN. DEBPKG:debian/db_file_ver - http://bugs.debian.org/340047 Remove overly restrictive DB_File version check. DEBPKG:debian/doc_info - Replace generic man(1) instructions with Debian-specific information. DEBPKG:debian/enc2xs_inc - http://bugs.debian.org/290336 Tweak enc2xs to follow symlinks and ignore missing @INC directories. DEBPKG:debian/errno_ver - http://bugs.debian.org/343351 Remove Errno version check due to upgrade problems with long-running processes. DEBPKG:debian/libperl_embed_doc - http://bugs.debian.org/186778 Note that libperl-dev package is required for embedded linking DEBPKG:fixes/respect_umask - Respect umask during installation DEBPKG:debian/writable_site_dirs - Set umask approproately for site install directories DEBPKG:debian/extutils_set_libperl_path - EU:MM: Set location of libperl.a to /usr/lib DEBPKG:debian/no_packlist_perllocal - Don't install .packlist or perllocal.pod for perl or vendor DEBPKG:debian/prefix_changes - Fiddle with _PREFIX and variables written to the makefile DEBPKG:debian/fakeroot - Postpone LD_LIBRARY_PATH evaluation to the binary targets. DEBPKG:debian/instmodsh_doc - Debian policy doesn't install .packlist files for core or vendor. DEBPKG:debian/ld_run_path - Remove standard libs from LD_RUN_PATH as per Debian policy. DEBPKG:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable. 
DEBPKG:debian/m68k_thread_stress - http://bugs.debian.org/517938 http://bugs.debian.org/495826 Disable some threads tests on m68k for now due to missing TLS. DEBPKG:debian/mod_paths - Tweak @INC ordering for Debian DEBPKG:debian/module_build_man_extensions - http://bugs.debian.org/479460 Adjust Module::Build manual page extensions for the Debian Perl policy DEBPKG:debian/prune_libs - http://bugs.debian.org/128355 Prune the list of libraries wanted to what we actually need. DEBPKG:fixes/net_smtp_docs - [rt.cpan.org #36038] http://bugs.debian.org/100195 Document the Net::SMTP 'Port' option DEBPKG:debian/perlivp - http://bugs.debian.org/510895 Make perlivp skip include directories in /usr/local DEBPKG:debian/disable-zlib-bundling - Disable zlib bundling in Compress::Raw::Zlib DEBPKG:debian/cpanplus_definstalldirs - http://bugs.debian.org/533707 Configure CPANPLUS to use the site directories by default. DEBPKG:debian/cpanplus_config_path - Save local versions of CPANPLUS::Config::System into /etc/perl. 
DEBPKG:debian/deprecate-with-apt - http://bugs.debian.org/580034 Point users to Debian packages of deprecated core modules DEBPKG:fixes/hurd-ccflags - [a190e64] http://bugs.debian.org/587901 [perl #92244] Make hints/gnu.sh append to $ccflags rather than overriding them DEBPKG:debian/squelch-locale-warnings - http://bugs.debian.org/508764 Squelch locale warnings in Debian package maintainer scripts DEBPKG:debian/skip-upstream-git-tests - Skip tests specific to the upstream Git repository DEBPKG:fixes/extutils-cbuilder-cflags - [011e8fb] http://bugs.debian.org/624460 [perl #89478] Append CFLAGS and LDFLAGS to their Config.pm counterparts in EU::CBuilder DEBPKG:fixes/module-build-home-directory - http://bugs.debian.org/624850 [rt.cpan.org #67893] Fix failing tilde test when run under a UID without a passwd entry DEBPKG:debian/patchlevel - http://bugs.debian.org/567489 List packaged patches for 5.14.2-6ubuntu2.1 in patchlevel.h DEBPKG:fixes/h2ph-multiarch - [e7ec705] http://bugs.debian.org/625808 [perl #90122] Make h2ph correctly search gcc include directories DEBPKG:fixes/index-tainting - [3b36395] http://bugs.debian.org/291450 [perl #64804] RT 64804: tainting with index() of a constant DEBPKG:debian/skip-kfreebsd-crash - http://bugs.debian.org/628493 [perl #96272] Skip a crashing test case in t/op/threads.t on GNU/kFreeBSD DEBPKG:fixes/document_makemaker_ccflags - http://bugs.debian.org/628522 [rt.cpan.org #68613] Document that CCFLAGS should include $Config{ccflags} DEBPKG:fixes/sys-syslog-socket-timeout-kfreebsd.patch - http://bugs.debian.org/627821 [rt.cpan.org #69997] Use a socket timeout on GNU/kFreeBSD to catch ICMP port unreachable messages DEBPKG:fixes/hurd-hints - http://bugs.debian.org/636609 Improve general GNU hints, needed for GNU/Hurd. 
DEBPKG:fixes/podfixes - [7698aed] http://bugs.debian.org/637816 Fix typos in several pod/perl.pod files DEBPKG:debian/find_html2text - http://bugs.debian.org/640479 Configure CPAN::Distribution with correct name of html2text DEBPKG:fixes/digest_eval_hole - http://bugs.debian.org/644108 Close the eval "require $module" security hole in Digest->new($algorithm) DEBPKG:fixes/hurd-ndbm - [f0d0a20] [perl #102680] http://bugs.debian.org/645989 Add GNU/Hurd hints for NDBM_File DEBPKG:fixes/sysconf.t-posix - [8040185] [perl #102888] http://bugs.debian.org/646016 Fix hang in ext/POSIX/t/sysconf.t on GNU/Hurd DEBPKG:fixes/hurd-largefile - [1fda587] [perl #103014] http://bugs.debian.org/645790 enable LFS on GNU/Hurd DEBPKG:debian/hurd_test_todo_syslog - http://bugs.debian.org/650093 Disable failing GNU/Hurd tests in cpan/Sys-Syslog/t/syslog.t DEBPKG:fixes/hurd_skip_itimer_virtual - [rt.cpan.org #72754] http://bugs.debian.org/650094 Skip interval timer tests in Time::HiRes on GNU/Hurd DEBPKG:debian/hurd_test_skip_socketpair - http://bugs.debian.org/650186 Disable failing GNU/Hurd tests ext/Socket/t/socketpair.t DEBPKG:debian/hurd_test_skip_sigdispatch - http://bugs.debian.org/650188 Disable failing GNU/Hurd tests op/sigdispatch.t DEBPKG:debian/hurd_test_skip_stack - http://bugs.debian.org/650175 Disable failing GNU/Hurd tests dist/threads/t/stack.t DEBPKG:debian/hurd_test_skip_recv - http://bugs.debian.org/650095 Disable failing GNU/Hurd tests cpan/autodie/t/recv.t DEBPKG:debian/hurd_test_skip_libc - http://bugs.debian.org/650097 Disable failing GNU/Hurd tests dist/threads/t/libc.t DEBPKG:debian/hurd_test_skip_pipe - http://bugs.debian.org/650187 Disable failing GNU/Hurd tests io/pipe.t DEBPKG:debian/hurd_test_skip_io_pipe - http://bugs.debian.org/650096 Disable failing GNU/Hurd tests dist/IO/t/io_pipe.t Built under linux Compiled at Aug 10 2012 21:26:09 @INC: /etc/perl /usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.14 
/usr/share/perl/5.14 /usr/local/lib/site_perl .

4a) root@backup:/home/glacier# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.1 LTS
Release: 12.04
Codename: precise

4b) root@backup:/home/glacier# uname -a
Linux backup 3.2.0-29-generic-pae #46-Ubuntu SMP Fri Jul 27 17:25:43 UTC 2012 i686 i686 i386 GNU/Linux

5) root@backup:/home/glacier# echo $?
0
6) There ISN'T any message related to memory in the syslog* files.
7) This may be a reason, because my computer has only 1 GB of RAM. However, I uploaded 1.3 TB to Amazon Glacier on this computer with mt-aws-glacier version 1.112 without this problem. The problem only started after I updated mt-aws-glacier to the latest version, 1.120. I will terminate all other programs on this computer to free up memory for mt-aws-glacier and try again.

Thanks for your support!

pqkhanhvn commented 9 years ago

I stopped all other programs. Free memory is 370 MB. I reduced the concurrency parameter to 2, but the problem still occurs (free memory > concurrency*partsize).

$mtglacier sync --config=glacier.cfg --dir /storage/DATA --vault=data --journal=journal.info --concurrency=2 --partsize=64
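The check "free memory > concurrency*partsize" is the poster's own rule of thumb, not a documented requirement of the tool. As a rough illustration of the arithmetic only, here is a Python sketch; `part_buffer_bytes` and `fits_in_memory` are hypothetical names, and the estimate is a lower bound that ignores interpreter overhead and any buffering in the parent process.

```python
MIB = 1024 * 1024


def part_buffer_bytes(concurrency, partsize_mib):
    """Bytes needed to hold one part per worker (lower bound only)."""
    return concurrency * partsize_mib * MIB


def fits_in_memory(free_mib, concurrency, partsize_mib):
    """True if the rule of thumb 'free > concurrency * partsize' holds."""
    return free_mib * MIB > part_buffer_bytes(concurrency, partsize_mib)


if __name__ == "__main__":
    # Values from the two commands in this thread, with 370 MB reported free.
    for concurrency in (4, 2):
        need_mib = part_buffer_bytes(concurrency, 64) // MIB
        print(concurrency, need_mib, fits_in_memory(370, concurrency, 64))
```

With 370 MB free, both --concurrency=4 (256 MiB of part data) and --concurrency=2 (128 MiB) pass this check, so the rule alone does not explain the failure.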

vsespb commented 9 years ago

Could you pls try branch debug_for_issue_105 - https://github.com/vsespb/mt-aws-glacier/tree/debug_for_issue_105

I've added some debugging.

vsespb commented 9 years ago

Also, pls try running strace mtglacier ... and paste last lines of output here, after process gone.

pqkhanhvn commented 9 years ago

I've cloned the new branch debug_for_issue_105 on the same machine and uploaded 50 GB without this problem. I will continue uploading the remaining data and will update you if it produces any error messages.

pqkhanhvn commented 7 years ago

I got this problem again with branch debug_for_issue_105. The upload processes stopped without an error message. It repeats many times. Below is the upload log:

PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [0]
PID 6621 HTTP connection problem (timeout?). Will retry (20 seconds spent for request)
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [67108864]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [134217728]
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [201326592]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [268435456]
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [335544320]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [402653184]
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [469762048]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [536870912]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [671088640]
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [603979776]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [738197504]
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [805306368]
PID 6622 Uploaded part for DATA/part1.tar.gz.gpg at offset [939524096]
PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [872415232]

This is my environment information:

Could you please fix this problem? Thank you for the great tool.

vsespb commented 7 years ago

so, "PID 6621 Uploaded part for DATA/part1.tar.gz.gpg at offset [872415232]" - is the very last line in output?

pqkhanhvn commented 7 years ago

yes PID 6621 is the last line.

vsespb commented 7 years ago

again, as I asked two yrs ago - pls check syslog/dmesg for Out of memory messages, OOM Killer, segfaults, etc
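A saved copy of the `dmesg` output or of the syslog file can be scanned for such messages mechanically. The sketch below is Python and only illustrative; the pattern list is an assumption based on typical Linux kernel wording, `suspicious_lines` is a hypothetical name, and the sample line is taken from a dmesg excerpt posted later in this thread.

```python
import re

# Assumed wording of typical kernel messages; extend as needed.
PATTERNS = re.compile(
    r"out of memory|oom-killer|killed process|segfault",
    re.IGNORECASE,
)


def suspicious_lines(log_text):
    """Return the log lines that mention OOM kills or segfaults."""
    return [line for line in log_text.splitlines() if PATTERNS.search(line)]


if __name__ == "__main__":
    sample = (
        "[158309.634505] Out of memory: Kill process 15899 (mdadm) "
        "score 870 or sacrifice child\n"
        "[158310.000000] some unrelated message\n"
    )
    for line in suspicious_lines(sample):
        print(line)
```

An empty result does not prove memory was fine; it only means none of the assumed patterns appeared in the text that was scanned.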

pqkhanhvn commented 7 years ago

I checked the log files but didn't find any Out of memory message. I am uploading now and the system has 158M free:

root@backup:/var/log# free
             total       used       free     shared    buffers     cached
Mem:       2055684    1897656     158028       4672     274024    1201712
-/+ buffers/cache:     421920    1633764
Swap:      1030140       1388    1028752

Please advise!

pqkhanhvn commented 7 years ago

I've doubled the RAM in the system, but the upload process still stopped, with the error messages shown in the attached image.

Could you please fix it? Thank you very much! Great tool.

vsespb commented 7 years ago

yes, but fix what? sha256 computation has worked without flaws for years for me and other users.

you started the ticket in 2014. was it the same 1) hardware 2) software as now?

3) pls check hardware with memtest86 or memtester ( http://manpages.ubuntu.com/manpages/xenial/man8/memtester.8.html )

last signature error could be because of broken RAM (besides you've just added new RAM banks)

pqkhanhvn commented 7 years ago

Memory passed memtest86 but the problem still happens. I have also built another machine with Ubuntu 14 and the same perl version, but I am still facing the problem. Below are the software versions on the new computer:

image

Please advise!

vsespb commented 7 years ago

you said "It repeats many times." but how often? every 1/10 minutes/hours? or less often?

pqkhanhvn commented 7 years ago

It is random. This is information collected from my log.

Fri Jul 14 19:29:41 ICT 2017 :Stopped after: 3820 (seconds)
Fri Jul 14 20:57:11 ICT 2017 :Stopped after: 3310 (seconds)
Fri Jul 14 23:26:29 ICT 2017 :Stopped after: 1288 (seconds)
Sat Jul 15 07:37:12 ICT 2017 :Stopped after: 2171 (seconds)
Sat Jul 15 08:32:29 ICT 2017 :Stopped after: 1948 (seconds)
Sat Jul 15 11:41:16 ICT 2017 :Stopped after: 5235 (seconds)
Sat Jul 15 13:47:38 ICT 2017 :Stopped after: 2257 (seconds)
Sat Jul 15 21:31:30 ICT 2017 :Stopped after: 5189 (seconds)
Sun Jul 16 07:41:47 ICT 2017 :Stopped after: 526 (seconds)
Sun Jul 16 14:27:44 ICT 2017 :Stopped after: 16422 (seconds)
Sun Jul 16 20:37:13 ICT 2017 :Stopped after: 13272 (seconds)
Mon Jul 17 04:45:06 ICT 2017 :Stopped after: 23225 (seconds)
Mon Jul 17 11:43:31 ICT 2017 :Stopped after: 9030 (seconds)
Mon Jul 17 13:54:11 ICT 2017 :Stopped after: 7630 (seconds)
Mon Jul 17 15:13:23 ICT 2017 :Stopped after: 4582 (seconds)
Mon Jul 17 16:52:41 ICT 2017 :Stopped after: 5320 (seconds)
Mon Jul 17 22:18:19 ICT 2017 :Stopped after: 18918 (seconds)
Tue Jul 18 12:48:47 ICT 2017 :Stopped after: 14866 (seconds)
Tue Jul 18 14:43:24 ICT 2017 :Stopped after: 6203 (seconds)
Tue Jul 18 17:40:58 ICT 2017 :Stopped after: 10137 (seconds)
Tue Jul 18 21:00:03 ICT 2017 :Stopped after: 3722 (seconds)
Fri Jul 21 18:20:55 ICT 2017 :Stopped after: 3054 (seconds)
Sat Jul 22 00:31:19 ICT 2017 :Stopped after: 10698 (seconds)
Mon Jul 24 17:35:37 ICT 2017 :Stopped after: 2136 (seconds)
Tue Jul 25 01:20:24 ICT 2017 :Stopped after: 18563 (seconds)
Tue Jul 25 14:07:24 ICT 2017 :Stopped after: 143 (seconds)
Wed Jul 26 19:35:21 ICT 2017 :Stopped after: 5660 (seconds)
Wed Jul 26 20:55:56 ICT 2017 :Stopped after: 1795 (seconds)
Thu Jul 27 18:36:45 ICT 2017 :Stopped after: 9944 (seconds)
Fri Jul 28 00:48:48 ICT 2017 :Stopped after: 8327 (seconds)
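The run times in this log range from 143 to 23225 seconds. To answer the "how often" question with numbers rather than by eye, the durations can be extracted and summarized. This is a small Python sketch under the assumption that each entry contains the text "Stopped after: N (seconds)"; `stop_durations` and `summarize` are hypothetical helper names.

```python
import re
import statistics

# Assumed entry format: "... :Stopped after: 3820 (seconds)"
STOP_RE = re.compile(r"Stopped after: (\d+) \(seconds\)")


def stop_durations(log_text):
    """Return every 'Stopped after' duration in seconds, in log order."""
    return [int(m.group(1)) for m in STOP_RE.finditer(log_text)]


def summarize(durations):
    """Return run count, min, median and max; expects a non-empty list."""
    return {
        "runs": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    sample = (
        "Fri Jul 14 19:29:41 ICT 2017 :Stopped after: 3820 (seconds)\n"
        "Fri Jul 14 20:57:11 ICT 2017 :Stopped after: 3310 (seconds)\n"
        "Fri Jul 14 23:26:29 ICT 2017 :Stopped after: 1288 (seconds)\n"
    )
    print(summarize(stop_durations(sample)))
```

A wide spread between minimum and maximum, as seen here, fits the description of the stops as random rather than tied to a fixed timeout.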

bpmckinnon commented 5 years ago

Hi, I just started using the software and I'm getting a similar issue. I'll try to get you some useful debug info on my next run. For me it repeats every few hours, and I'm running it on many smallish files (15000 uploads for 50GB). So far I have:

dmesg
[158309.634505] Out of memory: Kill process 15899 (mdadm) score 870 or sacrifice child
[158309.634510] Killed process 15899 (mdadm) total-vm:1818512kB, anon-rss:1801668kB, file-rss:0kB

I've restarted the process with strace and I'll let you know if I find anything.

bpmckinnon commented 5 years ago

One thing I've noticed is that the first perl process is holding onto quite a bit of memory, given that I'm running it on an old server with 2GB of RAM.

root 16199 1.5 0.0 5204 1276 pts/0 S 11:18 0:09 strace -o mtglacier.strace /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --
root 16201 4.1 4.1 2734840 83748 pts/0 S 11:18 0:26 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16202 0.4 1.8 101152 36396 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16203 0.4 1.8 101964 37336 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16204 0.4 1.7 100104 35552 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16205 0.4 1.9 102984 38460 pts/0 S 11:18 0:03 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16206 0.4 1.7 100240 35464 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16207 0.4 1.9 103400 38704 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16208 0.4 1.8 101428 36724 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16209 0.4 1.7 100416 35764 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16210 0.4 1.9 103036 38408 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa
root 16211 0.4 1.9 102776 38296 pts/0 S 11:18 0:02 perl /usr/local/src/mt-aws-glacier/mtglacier sync --new --config=/usr/local/bin/glacier/samba-glacier.cfg --filter=-glacier-journa

bpmckinnon commented 5 years ago

I have 2 strace stacks. Let me know if you want the full files (I'm not 100% sure how much data is in there, so I don't know whether I should just post the entire file).

The first is:
write(16, "00000036", 8) = 8
write(16, "16493\tupload_part\t9142\t354\t40024"..., 36) = 36
write(16, "{\"mtime\":1279327706,\"partfinal"..., 354) = 354
write(16, "RIFF\26\270b\2AVI LISTR\3\0\0hdrlavih8\0\0\0"..., 40024094) = 40024094
select(24, [3 4 5 7 9 11 13 15 17 19], NULL, NULL, NULL) = 1 (in [9])
read(9, "00000026", 8) = 8
read(9, "16498\tresponse\t9141\t238\t0\n", 26) = 26
read(9, "{\"console_out\":\"Created an uploa"..., 238) = 238
write(1, "PID 16498 Created an upload_id K"..., 124) = 124
mmap(NULL, 268439552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x15c94000) = 0x5c94000
mmap(NULL, 268570624, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 268439552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

The second is:
read(32, "", 8192) = 0
brk(0x40f0000) = 0x40f0000
brk(0x41f0000) = 0x41f0000
write(10, "00000036", 8) = 8
write(10, "16201\tupload_part\t8582\t296\t62247"..., 36) = 36
write(10, "{\"relfilename\":\"Pre-2012 incl. e"..., 296) = 296
write(10, "RIFF>\323\265\3AVI LISTR\3\0\0hdrlavih8\0\0\0"..., 62247750) = 62247750
select(24, [3 4 5 7 9 11 13 15 17 19], NULL, NULL, NULL) = 1 (in [19])
read(19, "00000026", 8) = 8
read(19, "16211\tresponse\t8576\t118\t0\n", 26) = 26
read(19, "{\"console_out\":\"Uploaded part fo"..., 118) = 118
write(1, "PID 16211 Uploaded part for Pre-"..., 96) = 96
munmap(0x7fb517273000, 268439552) = 0
mmap(NULL, 268439552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x1414e000) = 0x41f0000
mmap(NULL, 268570624, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 268439552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x1414e000) = 0x41f0000
mmap(NULL, 268570624, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fb51f274000

Let me know if there is a way to provide more precise error data.
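Both traces end with anonymous mmap requests of roughly 256 MiB each failing with ENOMEM. For a full strace file, the failed allocation sizes can be pulled out with a short script. This Python sketch is illustrative only; `failed_mmap_sizes` is a hypothetical name and the regular expression assumes strace lines formatted like the ones quoted in this comment.

```python
import re

# Assumed format: mmap(NULL, <size>, ...) = -1 ENOMEM (Cannot allocate memory)
ENOMEM_MMAP_RE = re.compile(r"mmap\(NULL, (\d+),[^)]*\)\s*=\s*-1 ENOMEM")


def failed_mmap_sizes(strace_text):
    """Return the requested size in bytes of every mmap that hit ENOMEM."""
    return [int(m.group(1)) for m in ENOMEM_MMAP_RE.finditer(strace_text)]


if __name__ == "__main__":
    sample = (
        "mmap(NULL, 268439552, PROT_READ|PROT_WRITE, "
        "MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)\n"
        "mmap(NULL, 134217728, PROT_NONE, "
        "MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fb51f274000\n"
    )
    for size in failed_mmap_sizes(sample):
        print(size, round(size / (1024 * 1024), 2), "MiB")
```

The successful mmap in the sample is skipped because its return value is an address rather than -1 ENOMEM.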

vsespb commented 5 years ago

It's not a similar issue. Please start a new issue. And I am not sure what I should do here if there is not enough RAM for processing.