daitss / core

DAITSS: Dark Archive In The Sunshine State
GNU General Public License v3.0

Memory spikes while ingesting packages with .mov files #778

Closed: szanati closed this issue 8 years ago

szanati commented 8 years ago

Last night, while ingesting a single 4.8 GB package containing only 4 files, including a .mov file and a .mp4 file, memory usage jumped to:

Physical 40023M 40217M 99%
Actual 33056M 40217M 82%
Swap 16383M 16383M 100%

DAITSS crashed at about 3:17 am. Crontab doesn't kick off until after 4 am. We had a similar issue with .mov files back in December; see GitHub issue #775. At the time we thought the cause was the way the package was put together. This time the package is from a different affiliate who sends us material on a regular basis, so I know it is not how the package is put together. The only common theme between the 2 issues is the .mov files.

jonpitts commented 8 years ago

One way to know for sure.

https://github.com/daitss/transform/blob/master/daitss-config.example.yml#L56

Extract the .mov files from the package and have someone run lqt_transcode on the DAITSS server while watching the memory.
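One possible way to do the "watch the memory" part without sitting on top is a small wrapper that runs a command and reports its peak memory afterwards. This is only a sketch, not part of DAITSS: the function name `run_and_measure` and the optional `limit_bytes` cap are made up for illustration, and the exact lqt_transcode command line should be copied from your own DAITSS config rather than from here.

```python
import resource
import subprocess
import sys


def run_and_measure(cmd, limit_bytes=None):
    """Run cmd to completion; return (exit code, peak RSS of waited-for children)."""

    def cap_memory():
        # Runs in the child just before exec: cap its address space so a
        # runaway process fails instead of exhausting the server.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    proc = subprocess.run(cmd, preexec_fn=cap_memory if limit_bytes else None)
    # ru_maxrss is the largest peak resident set size among the children this
    # process has waited for: kilobytes on Linux, bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: measure_peak.py COMMAND [ARGS...]")
    # Everything after the script name is passed through as the command.
    code, peak = run_and_measure(sys.argv[1:])
    print(f"exit code {code}, peak RSS {peak} (KB on Linux)")
```

Saved under a hypothetical name such as `measure_peak.py`, it could be invoked as `python measure_peak.py` followed by the same transcode command DAITSS uses for .mov files. If the reported peak is in the tens of gigabytes for the extracted .mov file alone, that would point at transcoding rather than at metadata extraction.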

cchou commented 8 years ago

Which process is taking up the memory? Which step was ingest on? If it's format-specific, there are usually two possibilities: 1) metadata extraction and processing, or 2) transcoding. Looking at the WIP can usually give us an idea of where ingest fails. If it's metadata extraction and processing, it's usually in DROID, JHOVE, or building the AIP descriptor. If it's transcoding, you can run the lqt_transcode command to see if it blows up. The transcoding tools, including lqt_transcode, may need an update, as they are all several years old already.

szanati commented 8 years ago

Thanks, Jonathan and Carol. I am first trying the package again to see if the memory spikes. If it does, I will get Darryl to run lqt_transcode on the .mov file to see if the memory goes up there as well. If it does, then I will mention to Darryl that lqt_transcode may need to be updated. It seems that what the past few memory spikes have in common is that the packages contain .mov files.

cchou commented 8 years ago

I recommend archiving this package with MOV normalization disabled.

There was a bug in disabling normalization. The fixes are in commit dde0da07e43b466acaa0d5579efbe11c2b04d0d0 and commit https://github.com/daitss/transform/commit/573635188a9001ed01e0fd595ce84f4648b30e81