MediaArea / RAWcooked

Encodes RAW audio-visual data into the Matroska container (MKV), using the video codec FFV1 for the image and audio codec FLAC for the sound.
https://mediaarea.net/RAWcooked
BSD 2-Clause "Simplified" License

Computer build recommendation? #375

Open dlasusa opened 2 years ago

dlasusa commented 2 years ago

Hi All,

My apologies if this isn't the right place to ask.

TL;DR version: Are there any recommended build guides or system specs for building a machine specifically to maximize the performance of RawCooked? Or what resources to focus on? (cpu threads, cpu frequency, memory, gpu, storage I/O, etc.)

A bit more detail:

I've been testing RawCooked, and while watching my system resources I found the following (sorry for the subjective numbers/wording):

Command used: RAWcooked.exe --all ./dpx_2k_files/ -o G:\temp\rc_dpx_2k.mkv (this is reading from one SSD and writing to another - both drives are local)

During analysis: 50% read speed from source; CPU is only "semi-busy" (plenty of headroom for more work); no visible bottlenecks.

Then a long pause with no indication of what it's doing during which: very low cpu usage; source drive still at 50% read (plenty of headroom to read faster); no visible bottlenecks.

Once ffmpeg starts: good cpu usage (50% over all cores); no source drive activity; low destination drive activity (24%); CPU wasn't pinned, but it was the only thing doing any real work.

Long pause at end of ffmpeg: good cpu usage; minimal activity elsewhere; no visible bottlenecks.

System Specs:

Just wondering if I were to upgrade this system or get a new system, specifically for RawCooked, if there is anything I should be focusing on?

Thank you! And thank you for a nice (and very useful) piece of software!

Dan

JeromeMartinez commented 2 years ago

My apologies if this isn't the right place to ask.

This is the right place to ask.

Are there any recommended build guides or system specs for building a machine specifically to maximize the performance of RawCooked? Or what resources to focus on? (cpu threads, cpu frequency, memory, gpu, storage I/O, etc.)

Currently there is no complete reference for that. We mainly recommend good I/O on separate fast disks (SSD, or network storage) able to read and write at the speed the CPU can handle. We base our estimate on 50 Mbps of content per core per GHz per pass (the analysis pass is expected to be fast compared to the others). Content bitrate is width x height x components x bit depth x frame rate, so e.g. HDTV content is 1920x1080x3x10x30 = ~1900 Mbps. For real-time processing you need (for 1 pass):

That said, the main issue is knowing how much content you have to compress, and in how much time, because real time is not needed: e.g. if you have 100 hours of content to handle in 1 month, a computer able to compress at 0.2x real time is enough.
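The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb quoted in this reply (50 Mbps per core per GHz per pass); the function names are made up for the example, and the "1 month" figure assumes a 30-day month with the machine running around the clock.

```python
# Rough sizing sketch based on the rule of thumb quoted above.
# All names are illustrative; nothing here is part of RAWcooked.

def content_mbps(width, height, components, bit_depth, fps):
    """Raw content bitrate in megabits per second (1 Mbps = 1e6 bit/s)."""
    return width * height * components * bit_depth * fps / 1e6

def core_ghz_needed(mbps, speed_factor=1.0, mbps_per_core_ghz=50.0):
    """Cores x GHz needed for one pass at `speed_factor` times real time."""
    return mbps * speed_factor / mbps_per_core_ghz

def required_speed_factor(content_hours, available_hours):
    """Fraction of real time needed to finish the backlog in the time available."""
    return content_hours / available_hours

hdtv = content_mbps(1920, 1080, 3, 10, 30)
print(f"HDTV content: {hdtv:.0f} Mbps")
print(f"Real time, 1 pass: {core_ghz_needed(hdtv):.1f} core-GHz")

# 100 hours of content in one month (assumed: 30 days, running 24/7).
factor = required_speed_factor(100, 30 * 24)
print(f"Needed speed: {factor:.2f}x real time")
print(f"At 0.2x real time, 1 pass: {core_ghz_needed(hdtv, 0.2):.1f} core-GHz")
```

Under these assumptions the backlog needs somewhat less than 0.2x real time, which is consistent with the figure given above.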

Just wondering if I were to upgrade this system or get a new system, specifically for RawCooked, if there is anything I should be focusing on?

Coherency between CPU and storage speed. E.g. the real-world performance of your SSD is around 300 MB/s, so having a bigger CPU would not help (blocked by I/O) and having a faster SSD would not help either (blocked by CPU).
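One way to check this coherency is to compare the throughput the CPU can sustain (using the same 50 Mbps per core per GHz rule of thumb) with the real-world speed of the storage. The sketch below is a simplification under that assumption: it treats all cores as equally usable, ignores how the source files pack their pixels, and the core counts and clock speeds are hypothetical.

```python
# Rough CPU-versus-storage comparison; names and machine specs are hypothetical.

def cpu_throughput_mb_s(cores, ghz, mbps_per_core_ghz=50.0):
    """Content the CPU can process per pass, in megabytes per second."""
    return cores * ghz * mbps_per_core_ghz / 8  # 8 bits per byte

def bottleneck(cores, ghz, disk_mb_s):
    """Return which side limits throughput: 'cpu' or 'storage'."""
    return "cpu" if cpu_throughput_mb_s(cores, ghz) < disk_mb_s else "storage"

# A hypothetical 8-core 3.5 GHz machine reading from a 300 MB/s SSD.
print(cpu_throughput_mb_s(8, 3.5), "MB/s ->", bottleneck(8, 3.5, 300))
# A hypothetical 32-core 3.0 GHz machine on the same SSD.
print(cpu_throughput_mb_s(32, 3.0), "MB/s ->", bottleneck(32, 3.0, 300))
```

Whichever side comes out as the limit is the one worth upgrading first; upgrading the other changes little.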

During analysis: 50% read speed from source

This part is opening all DPX files, and we plan to work on having more processing in parallel.

Then a long pause with no indication of what it's doing during which:

Weird, but I guess that this is still analysis.

Once ffmpeg starts good cpu usage (50% over all cores)

This may be due to issues with RAM bandwidth there, but it is difficult to say. There is definitely more work to do on that part.

Long pause at end of ffmpeg

This is the 2nd pass (decode and check reversibility), which is CPU-bound.

(though I don't think RawCooked uses the GPU.

The GPU is not used yet, but we are trying to make use of it.

We started with a proof of concept; we now have proof that the tool is useful for a lot of people, and we plan to work on improving speed, but there is no ETA (long to-do list, along with other projects).

Note that in the meantime some organizations use RAWcooked with GNU Parallel to run several instances of RAWcooked at the same time; this largely offsets the slowdown of a single instance and pushes your CPU and your SSD to their maximum.
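For those who would rather script this than use GNU Parallel, the same idea can be sketched in Python: one RAWcooked process per source folder, a few at a time. This is a minimal sketch, not a recommended setup. It assumes the executable is on the PATH as `rawcooked`, reuses only the `--all` and `-o` options from the command shown in the question, and the helper names, folder names, and one-folder-per-MKV layout are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def build_command(source_dir, output_dir, exe="rawcooked"):
    """Build one RAWcooked command line: <exe> --all <source> -o <output>.mkv"""
    source_dir = Path(source_dir)
    output = Path(output_dir) / (source_dir.name + ".mkv")
    return [exe, "--all", str(source_dir), "-o", str(output)]

def run_all(source_dirs, output_dir, jobs=2, runner=subprocess.run):
    """Run one command per source folder, at most `jobs` at a time.

    Threads are enough here: each one only waits for its child process.
    Results come back in the same order as `source_dirs`.
    """
    commands = [build_command(d, output_dir) for d in source_dirs]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(runner, commands))

# Example usage (hypothetical folder names), left commented out so that
# nothing is launched by accident:
# results = run_all(["dpx/reel1", "dpx/reel2", "dpx/reel3"], "mkv", jobs=2)
# for r in results:
#     print(r.args, "exit code", r.returncode)
```

The number of simultaneous jobs is worth tuning by watching CPU and disk usage: too few leaves resources idle, while too many makes the instances compete for the same disk.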

Thank you! And thank you for a nice (and very useful) piece of software!

Thank you :).

dlasusa commented 2 years ago

Thank you Jerome! A lot of fantastic and useful information in your reply! Much appreciated!