Whenever I try to retrieve a job, it ends up getting killed. I'm guessing this is because it's a big download, but I'm not sure.
The command I'm running:
/usr/bin/mtglacier download-inventory --config myconfigfile --vault myvaultname --new-journal /my/dir/journalfilename.log
The results I'm getting:
Killed
Let me know if there's any other information needed.
hi. OS version, perl version, device, free memory, approx. inventory size? also check syslog for the oom-killer ("Out of memory" messages and killed processes).
Message "Killed" goes from your Shell, not mt-aws-glacier (see https://www.thecodingforums.com/threads/killed-message-on-linux-running-from-bash.895178/#post-4802463 ) - someone killed the process with kill -9
. It must be oom-killer.
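to confirm, something like this should show it (assuming Debian's default /var/log/syslog; the exact wording varies by kernel version):
grep -iE 'out of memory|oom-killer|killed process' /var/log/syslog
dmesg | grep -iE 'out of memory|oom'   # kernel ring buffer, in case syslog already rotated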
OS: Debian 8
perl: v5.20.2
free memory: 50G
Size: 2892174934222595
It does look like the oom-killer. My question then: why is it using so much memory?
Size: 2892174934222595
what does this number mean and where did you get it?
I got it from: /usr/bin/mtglacier --config myfile.cfg list-vaults
In the output, there is a field that says: Archives: 50596957, Size: 2892174934222595
Are you looking for a different size? I'm guessing that is in petabytes?
Archives: 50596957, Size: 2892174934222595
if those figures look correct to you (and match what the Amazon Console shows), then OK. otherwise, if it's much higher than it should be, we need to investigate why.
anyway, you can try adding this debug code after line 47 ( https://github.com/vsespb/mt-aws-glacier/blob/master/lib/App/MtAws/Glacier/ListVaults.pm#L47 ):
# dump the raw inventory JSON to disk so it can be inspected separately
open F, ">", "/tmp/mtvaults.json" or die "cannot open /tmp/mtvaults.json: $!";
print F $self->{rawdata};
close F;
and then from the command line try to parse this JSON with a perl module:
perl -MFile::Slurp -MJSON::XS -e 'print JSON::XS->new->allow_nonref->decode(read_file("/tmp/mtvaults.json"))'
and see if it crashes because of memory. also see if other JSON tools can survive it. and just check the JSON file size in bytes.
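for example (quick checks; jq is just one alternative parser, assuming it's installed):
wc -c /tmp/mtvaults.json                               # file size in bytes
jq . /tmp/mtvaults.json > /dev/null && echo parsed ok  # try parsing with a different tool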
I think Amazon introduced the CSV inventory format maybe for such cases. see the --request-inventory-format option for retrieve-inventory. maybe it'll take less RAM, maybe not.
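i.e. something like this (a sketch modeled on the command you posted above - check the README for the exact option values):
/usr/bin/mtglacier retrieve-inventory --config myconfigfile --vault myvaultname --request-inventory-format csv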
It looks like it dies before it writes the file. Either way, I was able to get it to download by stopping pretty much all other processes on that server. I wonder if one of them was fighting for RAM.
I'll try the retrieve-inventory later, but I see in the README that mtglacier has issues with CSV files. Is that something that you think would impact this?
there are (or were) bugs in edge cases. "normal" users are not affected, i think. also CSV parsing is slower (but maybe uses less RAM, maybe).
also, you have 50M files and 50G RAM, so you have about 1000 bytes of RAM per record. one journal entry on disk takes (minimum!) 240 bytes (archive id, checksums, timestamps) plus the filename size - and that's on disk; internal in-memory structures can take more. quite possibly you won't be able to work with such a big journal on that machine (different commands will require different amounts of RAM with this journal).
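back-of-envelope: 50596957 records × 240 bytes is already ≈ 12.1 GB on disk before filenames are counted; if the in-memory structures cost a few times that per record, it's plausible that 50G of RAM is not enough for some commands.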
also: if you successfully download-inventory (i.e. get a mtglacier journal for the inventory), you can still do some stuff without enough memory. from the docs:
It's a text file. You can parse it with grep, awk, cut, tail etc, to extract information in case you need to perform some advanced stuff that mtglacier can't do (NOTE: make sure you know what you're doing). Each text line in the file represents one record.
i.e. you can, for example, split it into ~1000 small journals (with tools like head or tail) and do some simple operations - like delete all files or restore all files. Obviously you can't do complex logic like "sync only modified files", as that might require comparing a local file with the one recorded in the journal.
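a minimal sketch of such a split (hypothetical file names; assumes every journal line is an independent record, as the docs above say - check the README for the exact options each command needs):
# split the ~50M-line journal into pieces of 50000 records each (-a 3: enough suffixes for ~1000 files)
split -l 50000 -a 3 journalfilename.log journalpart.
# then run a simple operation against each piece, e.g. purge-vault (careful: this deletes archives!)
for j in journalpart.*; do
    /usr/bin/mtglacier purge-vault --config myconfigfile --vault myvaultname --journal "$j"
done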