vitrivr / cineast

Cineast is a multi-feature content-based multimedia retrieval engine. It is capable of retrieving images, audio and video sequences, as well as 3D models based on edge or color sketches, textual descriptions, and example objects.
MIT License

Coredump for OCRSearch feature extraction #359

Open sauterl opened 1 year ago

sauterl commented 1 year ago

In case others run into this issue as well, I'm documenting it here.

I used the OCRSearch feature in an extraction configured as follows:

{
    "input":{
        "path": "path/to/videos/",
        "depth": 1,
        "skip": 0,
        "id": {
            "name": "FileNameObjectIdGenerator",
            "properties": {}
        }
    },
    "extractors":[
        {"name": "OCRSearch"}
    ],
    "metadata": [
        {"name": "TechnicalVideoMetadataExtractor"},
        {"name": "EXIFMetadataExtractor"}
    ],
    "exporters":[
        {
            "name": "ShotThumbnailsExporter",
            "properties": {
                "destination":"thumbnails/"
            }
        }
    ],
    "segmenter": {
        "name":"org.vitrivr.cineast.core.extraction.segmenter.video.V3CMSBSegmenter",
        "properties": {
            "folder": "path/to/msb"
        }
    },
    "database": {
        "writer": "JSON",
        "selector": "NONE",
        "host":"./ocr"
    },
    "pipeline":{
        "shotQueueSize": 20,
        "threadPoolSize": 20,
        "taskQueueSize": 20
    }
}

After roughly 7000 segments, a core dump stopped the extraction, which was executed using:

java -Xmx32G -Xms32G -jar cineast-runtime/build/libs/cineast-runtime- ../cineast.json extract -e ../extraction-most.json

This occurred on Ubuntu 20.04.5 LTS, using OpenJDK:

openjdk version "17.0.5" 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu120.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)

with 40 cores of type Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz

GPU: NVIDIA GeForce RTX 2080 TI, nvidia-driver 450.51.06 and CUDA 11.0.228

lucaro commented 1 year ago

and what did the core dump say?

sauterl commented 1 year ago

Unfortunately, there is no further information beyond a generic "core dumped" message. I checked both the user's home and the working directory, and none of the usual error logs appeared. My assumption (since a proper Java core dump would produce a log file) is that some native library had an issue.

lucaro commented 1 year ago

Yes, that's usually the only way to trigger a core dump. It should still dump something though. If it were killed by the OS before it was able to do so, you would get a different message.
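If it helps to pin this down: a quick sketch of how to make a native crash leave more evidence behind (the log path is a placeholder, and whether a core file actually lands on disk depends on how Ubuntu routes dumps, e.g. via apport or systemd-coredump):

# allow the kernel to write a core file at all (per shell session)
ulimit -c unlimited

# tell HotSpot where to write its fatal-error log if the JVM itself crashes
java -Xmx32G -Xms32G \
     -XX:ErrorFile=/tmp/hs_err_pid%p.log \
     -jar cineast-runtime/build/libs/cineast-runtime-*.jar \
     ../cineast.json extract -e ../extraction-most.json

# on systemd-coredump systems, recent dumps are listed with
coredumpctl list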

silvanheller commented 1 year ago

Probably the same issue as https://github.com/vitrivr/cineast/issues/273, where we did not have sufficient information to reproduce the bug. Thanks for the information!

x4e-jonas commented 1 year ago

I'm observing a similar issue, though I'm not sure whether it is the same one as described here or in #273.

It seems to me that the extraction process loads the objects into memory one by one. As soon as the combined size of the objects exceeds the available memory, the process goes OOM. This happens after some number of large images, or, in the case of videos (which tend to be several hundred MB each), after a handful already. Allocating more memory (up to 32G or so) does not solve the issue but lets the extraction run a bit longer. The only workaround so far is a wrapper script that shards the files into a sequence of distinct extraction runs:
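A minimal sketch of that kind of wrapper, assuming batches of 50 files, that the extraction config literally contains the input path shown earlier in this thread, and a single runtime jar in the libs folder (all of these are placeholders):

#!/usr/bin/env bash
# Run one extraction per batch of input files instead of one big run.
set -euo pipefail

SRC=path/to/videos      # placeholder: the original input directory
BATCH=50                # placeholder: files per extraction run
i=0

mkdir -p shards
ls "$SRC" | split -l "$BATCH" - shards/shard_

for list in shards/shard_*; do
    work=$(mktemp -d)
    # symlink this batch into a fresh input directory
    while read -r f; do ln -s "$(realpath "$SRC/$f")" "$work/"; done < "$list"
    # point the config's input.path at the batch directory
    sed "s|path/to/videos/|$work/|" extraction.json > "extraction_$i.json"
    java -Xmx32G -jar cineast-runtime/build/libs/cineast-runtime-*.jar \
        cineast.json extract -e "extraction_$i.json"
    i=$((i+1))
done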

lucaro commented 1 year ago

If this is indeed the problem, possible workarounds would include:

  1. making the extraction thread pool size smaller in order to reduce the number of feature instances operating simultaneously
  2. reducing the video resolution in the decoder settings

Can you check whether one or both of these measures resolve your issue?
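For reference, the two settings would look roughly like this in the extraction config (a sketch; the decoders section and the maxFrameWidth/maxFrameHeight property names are assumptions based on the FFMpegVideoDecoder, so double-check them against your Cineast version):

{
    "pipeline": {
        "shotQueueSize": 5,
        "threadPoolSize": 4,
        "taskQueueSize": 10
    },
    "decoders": {
        "VIDEO": {
            "decoder": "FFMpegVideoDecoder",
            "properties": {
                "maxFrameWidth": 640,
                "maxFrameHeight": 480
            }
        }
    }
}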

x4e-jonas commented 1 year ago

Option 1 results in:

    DEBUG o.v.c.s.r.GenericExtractionItemHandler - ExtractionPipeline is full - deferring emission of segment. Consider increasing the thread-pool count for the extraction pipeline.

lucaro commented 1 year ago

This is expected behavior in this case and is what you would want to happen. It only tells you that the feature extraction is slower than the decoder; if you had more compute resources, you could increase throughput. If you are compute-limited (or, in this case, memory-limited), you want the pipeline to slow down rather than try to consume more resources than are available. Were you able to run your extraction without anything crashing?

x4e-jonas commented 1 year ago

The video resolution is already limited to 640x480. The affected instance has 32 CPU cores, 64G RAM, 4352 GPU cores, and 11G of GPU memory. What parameters would you recommend in order to run at full capacity?

lucaro commented 1 year ago

Whatever works 😉 The question cannot be answered as posed, since the answer also depends on the content to be processed. Is 640x480 enough for all the text to be readable? If yes, can you go lower and still keep it readable? If not, how much larger do you need to go? What is the mean shot length of your content? Does it have a lot of short sequences, or is it more of one continuous recording? The limiting factor here is the longest shot you need to keep in memory, so the maximum total memory consumption is proportional to the duration of the longest shot times the video resolution. As long as you keep that below your hardware limit, you should not experience any problems.
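To make that proportionality concrete, a back-of-envelope estimate (assuming 3 bytes per pixel and 25 fps; Cineast may buffer frames in a wider in-memory representation, so treat this as a lower bound):

640 x 480 px x 3 B             ≈ 0.9 MB per frame
0.9 MB x 25 fps x 60 s         ≈ 1.4 GB for one 60-second shot
1.4 GB x 20 queued segments    ≈ 28 GB with the shotQueueSize of 20 above

which is already at the limit of the 32G heap from the original report.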