obsproject / obs-studio

OBS Studio - Free and open source software for live streaming and screen recording
https://obsproject.com
GNU General Public License v2.0

OBS Studio (various versions) fail to record using nvenc (various versions) randomly on Windows 10 #8009

Closed Yamakuzure closed 1 year ago

Yamakuzure commented 1 year ago

Operating System Info

Windows 10

Other OS

No response

OBS Studio Version

Other

OBS Studio Version (Other)

28.0.3, 28.1.2, 29.0.0_beta1, 29.0.0_beta3

OBS Studio Log URL

Full log in dump archive, see below

OBS Studio Crash Log URL

No response

Expected Behavior

Press hotkey, do recording, press hotkey, recording stops and video is fine.

Current Behavior

Please see below, this can not be described in a few simple sentences.

Steps to Reproduce

  1. Press hotkey to start recording
  2. record video
  3. Press hotkey to stop recording

Anything else we should know?

(Renewal of #7946, which was about a different issue.)

Symptoms:


What I tried so far:

Rolled back and tested combinations of: Nvidia drivers

with OBS Studio:

Also, to make sure this is not some orphaned file from some borked driver update interfering, I have booted into safe mode and used DDU to fully remove Intel UHD and nvidia drivers. After another reboot into safe mode I used tweaks.com windows repair and let the full repair package run.


Hardware:

Dell Precision 7550 with custom upgrades.
CPU: Intel(R) Core(TM) i7-10875H
RAM: 2x 32GB Dual Channel 3GHz (plus 2 empty slots; nothing soldered or single channel)
GFX: Quadro T2000, 4GB RAM
Storage: 2x 1TB Class 50 NVMe


OS:

Windows 10


Dump files and log, made with nvidia drivers 527.27 (11/28/2022) and OBS Studio 29.0.0_beta3:

https://mega.nz/file/iFgnUL4C#j2wtv65vUaChG08M7-rGM50PioruOQb6bLbdZJ5KueQ

The contents of the archive are:


I have recorded several thousand clips without any problems except #6062, and after that was fixed I never had any issue until sometime in November. This also means that I have recorded dozens of clips with the combination of OBS Studio 28.0.3 and Nvidia drivers R515 U4 (517.40). The really mean and annoying detail is that I cannot guarantee that said combination never had any issue: this happens so randomly that I just might have been lucky in October.

So my next step will be to roll back to OBS Studio 27.2.4 with Nvidia drivers R515 U2 (516.59) from June 29, 2022, because that is the last combination I know works for sure.

I hope my dumps can help you guys! Thank you a lot for all of your hard work!

Fenrirthviti commented 1 year ago

In the future, please provide at least one log file that isn't hosted as a third party archive. Also, be mindful when sharing dump files, as they contain all information that OBS might have in memory, including potentially sensitive information such as stream keys.

Yamakuzure commented 1 year ago

> In the future, please provide at least one log file that isn't hosted as a third party archive. Also, be mindful when sharing dump files, as they contain all information that OBS might have in memory, including potentially sensitive information such as stream keys.

Alright, I'll see to it that a new log file is generated. I do not stream and have no sensitive information in either OBS or this Windows machine at all, so I hope I am safe. 😬

...one moment...

https://obsproject.com/logs/t6GgAX64K2SLwi91 - Sixth recording. The file is 1K. No idea whether this is helpful or not; this time the recording stopped after a few seconds, but apart from that the symptoms were the same: "Total Data Output" and "Bitrate" being zero, and "Disk full in (approx.)" showing a large negative number.

Edit: The recording stopped when I Alt-Tabbed out of the full screen game to get hold of the OBS GUI. This time I recorded Mass Effect 2 Legendary Edition. When I had this "Stopping..." freeze issue with Mass Effect 1 Legendary Edition, the recording almost never stopped unless killed with Task Manager.

Edit 2: Interestingly in Mass Effect 3 Legendary I had to use the Task Manager to kill obs+ffmpeg again. Only ME2:LE seems to cause it to stop by itself when Alt-Tabbing out. But contrary to what I wrote above, I went back to Nvidia drivers R510 U6 (512.78) from May 16, 2022, but kept OBS 29.0.0_beta3, after generating the log linked above. So I daresay the freezing has nothing to do with the used driver version... I'll downgrade OBS next to see what happens.

Yamakuzure commented 1 year ago

Just tested the new OBS-29.0.0. First recording went through just fine, second froze. Here is the log. The line "==== Shutting down =========" is the result of me killing the obs64.exe process in task manager. The output file 'D:/Shadow/video/obs/2023-01-12_09-09-56.mkv' was never written.

Edit: Let's go through the log snippet of the failed recording

09:09:56.849: ==== Recording Start ===============================================
09:09:56.849: [ffmpeg muxer: 'adv_file_output'] Writing file 'D:/Shadow/video/obs/2023-01-12_09-09-56.mkv'...

Here a whole bunch of stuff is simply missing. In the log you can see that after the successful recording logged above, many entries labeled [game-capture: 'Mass Effect 2 LE'] should have followed, but they never showed up.

09:10:20.140: Stopping recording due to hotkey

Now [ffmpeg muxer: 'adv_file_output'] should log that the output of the above file is stopped, and total frames output and drawn should be recorded. Neither happens. The final "==== Recording Stop ========" line also does not show up.

09:10:28.816: [game-capture: 'Mass Effect 2 LE'] ----------------- d3d11 capture freed ----------------
09:10:34.605: [game-capture: 'Mass Effect 2 LE'] capture window no longer exists, terminating capture
09:10:34.605: [game-capture: 'Mass Effect 2 LE'] capture stopped

This is after I Alt-Tabbed out of the game, closed it, and closed the OBS window (which was still "recording", with the "Stopping..." button on a blue background).

09:10:41.226: ==== Shutting down ==================================================
09:10:41.254: WASAPI: Device 'Kopfhörermikrofon (CORSAIR VOID PRO SURROUND USB Sound Adapter)' Terminated
09:10:41.260: WASAPI: Device 'Kopfhörer (CORSAIR VOID PRO SURROUND USB Sound Adapter)' Terminated
09:10:41.274: All scene data cleared
09:10:41.274: ------------------------------------------------

The rest is the result of killing the obs64.exe from task manager detail view.

I hope the lines that did not show up might tell someone what the heck is going on.
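The manual comparison above (a "Recording Start" line that never gets its "Recording Stop") can also be automated. The following is only a minimal sketch, assuming the log format shown in the excerpts of this thread; the function name and the choice of marker strings are illustrative and not part of OBS.

```python
# Markers as they appear in the log excerpts quoted in this thread.
START = "==== Recording Start"
STOP = "==== Recording Stop"
SHUTDOWN = "==== Shutting down"


def find_unfinished_recordings(log_text):
    """Return the timestamps of 'Recording Start' lines that never got a 'Recording Stop'."""
    unfinished = []
    open_start = None  # timestamp of the most recent start still waiting for its stop
    for line in log_text.splitlines():
        # Lines look like "09:09:56.849: message"; split at the first ": ".
        timestamp, _, message = line.partition(": ")
        if message.startswith(START):
            if open_start is not None:
                unfinished.append(open_start)
            open_start = timestamp
        elif message.startswith(STOP):
            open_start = None
        elif message.startswith(SHUTDOWN) and open_start is not None:
            unfinished.append(open_start)
            open_start = None
    if open_start is not None:
        unfinished.append(open_start)
    return unfinished


SAMPLE = """\
09:09:56.849: ==== Recording Start ===============================================
09:09:56.849: [ffmpeg muxer: 'adv_file_output'] Writing file 'D:/Shadow/video/obs/2023-01-12_09-09-56.mkv'...
09:10:20.140: Stopping recording due to hotkey
09:10:41.226: ==== Shutting down ==================================================
"""

print(find_unfinished_recordings(SAMPLE))  # ['09:09:56.849']
```

Running this over a full session log would list every recording that started but was never cleanly stopped, which makes it easier to count failures across many sessions.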

Yamakuzure commented 1 year ago

I found some sort of "workaround", at least in Mass Effect 1 Legendary Edition, I haven't tested this with ME2:LE or ME3:LE, yet.

When the recording does not start (Data written and Bitrate stay at 0), I press my hotkey to stop the recording, and the button in the OBS Window stays on "Stopping..." with blue background, I Quicksave the game and then Quickload. During the loading, when the scene is rebuilt, the "Stopping..." disappears and OBS is no longer frozen.

This could have been a coincidence, so I ran some tests, and I can reproduce this reliably. OBS stays frozen until I load a save game, which of course resets the engine and rebuilds the scene. I have tested waiting for between 10 seconds and over 5 minutes to make sure this was no coincidence, but OBS always came back to life at the same point in loading the savegame.

I hope this gives some more hints about where to look...

Yamakuzure commented 1 year ago

Unfortunately the "Workaround" does not always work. Today I had one occasion where the recording did not stop on game save reload. When I Alt-Tabbed out, the "Stopping..." button stayed frozen. When I closed OBS Studio, it said this would stop all recordings, but obs64.exe and obs-ffmpeg-mux.exe were still in the task manager and had to be killed.

Unfortunately I did not remember to take a dump of the process first. If this happens again, I'll dump it and make it available for you.

Yamakuzure commented 1 year ago

I could reproduce the "workaround not working" full freeze.

The recording would not stop, 0 bytes output, and I tried the "reload save game fix" from above. But the recording stayed stuck, like so often now.

So I tried to Alt-Tab out, this "fixed" the freeze a few times, too, but not this time.

Eventually I closed ME1:LE, closed OBS Studio and went into Taskmanager to find obs64.exe there still running with 86% GPU usage. (For what?)

Here is the log: 2023-01-18 07-59-17.txt
Here are the dumps: obs-29.0.0_Full_freeze.7z

In the log you can see:

08:42:58.133: ==== Shutting down ==================================================
08:42:58.164: WASAPI: Device 'Kopfhörermikrofon (CORSAIR VOID PRO SURROUND USB Sound Adapter)' Terminated
08:42:58.170: WASAPI: Device 'Kopfhörer (CORSAIR VOID PRO SURROUND USB Sound Adapter)' Terminated
08:42:58.174: [game-capture: 'Mass Effect 1 LE'] capture stopped
08:42:58.184: All scene data cleared
08:42:58.184: ------------------------------------------------

<== Here ME1:LE and OBS Studio were already closed and I was taking the process dumps.
   Eventually I killed the obs64.exe process. ==>

08:43:29.506: [NVIDIA NVENC H.264 (FFmpeg) encoder: 'advanced_video_recording'] Encoding queue duration surpassed 5 seconds, terminating encoder
08:43:29.506: Error encoding with encoder 'advanced_video_recording'
08:43:29.512: [ffmpeg muxer: 'adv_file_output'] Output of file 'D:/Shadow/video/obs/2023-01-18_08-42-04.mkv' stopped
08:43:29.512: Output 'adv_file_output': stopping

Yamakuzure commented 1 year ago

This issue has cost me so much time already that I resorted to something desperate: a full reset of all drivers and going back to OBS 27.1.3.

So I once again deleted all graphics drivers with DDU, but this time I re-installed the first driver packages for both Intel and nvidia graphics that Dell had, so I went back to nvidia drivers 451.67.

I then completely removed OBS, installed OBS 27.1.3 and deactivated "HAGS". OBS Studio shortcut is configured to start as Administrator.

As the nvidia drivers had to be fully wiped, I had to re-do my configuration: VSYNC Off, G-Sync activated, GPU limited to 145 FPS.

As those nvidia drivers are ancient in the driver world, I updated to the latest available pre-500 drivers: 474.14


So far I was able to record over 50 clips without any issues.

Next step is to update to the latest Dell-approved nvidia drivers, which will be 512.36. But these problems have already thrown me weeks behind schedule, so I will test newer drivers in a few days when I have caught up a bit.

Fenrirthviti commented 1 year ago

I've been following this, but haven't been able to reproduce any of the issues described, and it's sounding more and more like an environment or driver issue. I'm going to go ahead and close this for now, but if this starts happening again and can be isolated to specific reproduction steps, feel free to comment and we can reopen.

Yamakuzure commented 1 year ago

> I've been following this, but haven't been able to reproduce any of the issues described, and it's sounding more and more like an environment or driver issue. I'm going to go ahead and close this for now, but if this starts happening again and can be isolated to specific reproduction steps, feel free to comment and we can reopen.

I am not convinced, yet, because I was able to record dozens of clips with OBS Studio 27.1.3 and nvidia drivers 517.66 (latest official drivers from Dell).

However, maybe you are right. I will update to the latest OBS and nvidia drivers today, and if the issue is gone, it means that something went awry somewhere and was fixed by me completely removing everything and starting from scratch with the drivers.

I truly hope it is an environmental thing that got fixed with all the cleaning and reinstalls. But if the freezing comes back after the updates, then I am afraid I have to reopen this.

Edit: Sorry, but the first attempt to record something with OBS studio 29.0.1 went into a hard freeze, where I had to kill the process in task manager. I will roll back to OBS 27.1.3 and see whether it works again. If not, the nvidia driver is interfering. Otherwise this is a bug introduced in OBS 28. We'll see.

Yamakuzure commented 1 year ago

> Edit: Sorry, but the first attempt to record something with OBS studio 29.0.1 went into a hard freeze, where I had to kill the process in task manager. I will roll back to OBS 27.1.3 and see whether it works again. If not, the nvidia driver is interfering. Otherwise this is a bug introduced in OBS 28. We'll see.

I have rolled back and was able to do plenty of recordings using OBS Studio 27.1.3 with the latest nvidia drivers 528.24.

I am very sorry, I wanted this to be an issue with some environmental things, but this is clearly a bug introduced in OBS 27.2 as that's the earliest version I got the freezes with.

Please re-open and investigate. I have provided dumps and logs. If there is anything else I could provide, just give the word.

For the time being I am nailed on OBS 27.1.3.

Yamakuzure commented 1 year ago

@Fenrirthviti : I have proven that this is an actual bug. Could this please be opened? Bugs do not go away by closing their reports.

To clarify: I am super sorry that I was not able to pin this on some driver or settings issue on my machine. But I have also reported plenty of clues where to look (like the log comparison above and dumps), so I do hope this can be fixed some day.

OBS 27.1.3 with nvidia drivers 528.24 have been super stable. No issue for over 150 clips recorded so far. Upgrade to OBS 29.0.1: full freeze on any of the first 10 attempts to record something. Totally random, like before.

If I should attempt a shot in the dark, I'd say this is some dead-lock in a thread race.

Fenrirthviti commented 1 year ago

You have proven there is an issue on your system, but there is still no evidence I can find that this is a specific bug in OBS. This is most likely some kind of environment issue specific to your system. I can reopen this, but I've tested everything exactly as presented, with the driver versions given, and have not been able to replicate. If anyone else is able to test and confirm, that would be helpful for narrowing down what is going on.

As a note, I don't have the experience to dive into crash dumps, so if anyone would like to take a look at those and provide any insight, that would also be welcome.

Yamakuzure commented 1 year ago

Thanks. I have the issue with all OBS versions 27.2.x and above, and not a single issue with 27.1.3. That does somewhat speak against the "issue on my system" theory.

OBS < 27.2 => No issue
OBS > 27.1 => Sporadic freezing

Now, what I do know, is, that the shipped ffmpeg version has been upgraded with OBS 27.2, which is one of the reasons why I am so desperate to get the higher versions working. So maybe. Just maybe, I know this is far fetched, it has something to do with my settings and how they are translated in OBS, like it was with issue #6062? Actually I am amazed that I did not think of this possibility earlier, no matter how far fetched.

So here are my video settings: (*)

06:52:32.044: video settings reset:
06:52:32.044:   base resolution:   2560x1440
06:52:32.044:   output resolution: 2560x1440
06:52:32.044:   downscale filter:  Lanczos
06:52:32.044:   fps:               120/1
06:52:32.044:   format:            I444
06:52:32.044:   YUV mode:          709/Full

Here are my encoder settings for OBS 27.1.3:

{
    "bf": 0,
    "bitrate": 96000,
    "cqp": 12,
    "lookahead": true,
    "max_bitrate": 256000,
    "preset": "mq",
    "psycho_aq": false,
    "rate_control": "CQP"
}

And finally those settings adapted to OBS 27.2+

{
    "bf": 0,
    "bitrate": 96000,
    "cqp": 12,
    "lookahead": true,
    "max_bitrate": 256000,
    "preset": "p5",
    "preset2": "p5",
    "psycho_aq": false,
    "rate_control": "CQP"
}

(*) Just to make sure this does not become an issue again, here is another quote from the old issue about the settings mapping:

I have absolutely no problems with OBS 27.1.3 with these exact settings on this laptop. Neither with pre- nor post-500-series Nvidia GPU drivers.

Further, with OBS 27.1.3, I can record without any issues and with even higher settings. Questioning my settings and/or hardware does not help, I am afraid...

I do not want to sound rude, I just want to avoid wasting time with discussions about hardware and settings that worked fine for a 4-digit amount of recordings. I have reasons why I am recording the way I do record.

Edit: Of which 2,650 MKV files are still on my Backup drive.

pkviet commented 1 year ago

Did you make sure that HAGS is off each time you tested with 29? With reinstalls it sometimes gets surreptitiously re-enabled. HAGS has been causing a lot of issues. You do note that HAGS was off with 27; I just want to make sure it was off with 29 too.

Yamakuzure commented 1 year ago

> Did you make sure that HAGS is off each time you tested with 29? With reinstalls it sometimes gets surreptitiously re-enabled. HAGS has been causing a lot of issues. You do note that HAGS was off with 27; I just want to make sure it was off with 29 too.

It was always on and never caused any troubles. As you can see in #6062 I tried turning it off and it caused my FPS to drop tremendously.

But the last time I cleaned everything with DDU and started from scratch, I turned HAGS off first, and this time I had no ill effects.

So I have tested both OBS-29.0.0 and OBS-29.0.1 with both HAGS off and on.

RytoEX commented 1 year ago

> So maybe. Just maybe, I know this is far fetched, it has something to do with my settings and how they are translated in OBS, like it was with issue #6062?

The short answer is: no.


The long answer is: This may be possible, but perhaps not in the way you think.

We do a one-time NVENC settings _migration_ in 28.1.0+ (which was improved in 28.1.1) using recommendations from [NVIDIA's NVENC Preset Migration Guide](https://docs.nvidia.com/video-technologies/video-codec-sdk/nvenc-preset-migration-guide/index.html) to migrate things like "Performance", "Quality", and "Max Quality" to "P3", "P5", and "P5" respectively, with appropriate Tuning and Multipass parameters based on a best guess of common GPU generations and selected output resolutions in the wild. Once this migration has been performed, the user is free to fine-tune and select any parameters they want, and the user's chosen parameters will be used instead. This one-time migration does not occur again once a "preset2" value is present in your encoder settings. It is not a live translation that occurs on each launch of the app or encoder initialization.

Separately, there is code in our NVENC implementation that checks during encoder initialization if "preset2" has been set, and if it has not been set, _then_ it will translate the settings in place to appropriate combinations of Preset, Tuning, and Multipass. However, once again, this does not occur if "preset2" is set.

Further separately, FFmpeg itself will attempt to translate the old pre-SDK10 preset values to SDK10+ preset values if an older value is presented as the preset. Since we are passing new preset values as of OBS Studio 28.1, this translation does not occur. FFmpeg itself also uses the new preset values since FFmpeg 4.4.

At this point, please note that [NVIDIA's NVENC Preset Migration Guide](https://docs.nvidia.com/video-technologies/video-codec-sdk/nvenc-preset-migration-guide/index.html) and various encoder throughput figures most likely assume YUV420/I420 or NV12 (see: https://developer.nvidia.com/blog/introducing-video-codec-sdk-10-presets/), not YUV444/I444. I444 is more data, and requires more work in OBS due to data conversion, and thus may result in higher load. Additionally, because you are using I444 instead of NV12 or P010, OBS is falling back to the FFmpeg NVENC implementation, which I do not believe uses texture sharing, so there will be additional system resource load.

You can see that your system has some mild overload in [the log posted on Jan 7](https://github.com/obsproject/obs-studio/issues/8009#issuecomment-1374501478):

```
15:30:12.827: Output 'adv_file_output': Number of lagged frames due to rendering lag/stalls: 18 (0.4%)
15:30:12.828: ==== Recording Stop ================================================
15:30:12.829: Video stopped, number of skipped frames due to encoding lag: 33/4513 (0.7%)
```

All of that said and set aside, while "the SDK10+ presets are not 1:1 when compared to the pre-SDK10 presets" may be a contributing factor, it is likely not the root cause of either, "Clicking the 'Stop Recording' button sometimes does not stop my recording immediately," or, "My MKV files sometimes end up with 0 bytes in them."
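The one-time migration described in the long answer can be pictured roughly as follows. This is a hypothetical sketch, not OBS code: the function name, the dict-based settings, and passing the old preset by its display name are all assumptions made for illustration. Only the display-name mapping and the role of "preset2" come from the comment, and the lowercase stored value follows the JSON settings posted earlier in this thread. Tuning and Multipass selection are left out because the thread does not spell them out.

```python
# Display-name mapping as stated in the comment above.
OLD_TO_NEW_PRESET = {
    "Performance": "P3",
    "Quality": "P5",
    "Max Quality": "P5",
}


def migrate_once(settings, old_display_name):
    """Return settings with "preset2" filled in, unless it is already present.

    Hypothetical helper: the presence of "preset2" is treated as the marker
    that migration already happened (or that the user picked a value).
    """
    if "preset2" in settings:
        return settings  # already migrated or user-chosen: leave untouched
    migrated = dict(settings)
    new_preset = OLD_TO_NEW_PRESET.get(old_display_name)
    if new_preset is not None:
        # Stored lowercase, matching the "p5" seen in the posted settings.
        migrated["preset2"] = new_preset.lower()
    return migrated


print(migrate_once({"preset": "mq"}, "Max Quality"))
print(migrate_once({"preset": "mq", "preset2": "p1"}, "Max Quality"))
```

The second call returns its input unchanged, which is the point being made: once "preset2" exists, nothing is translated again.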

Nothing stood out in the dumps.

As there is already an abundance of information in this Issue, please answer the following questions concisely to confirm some details:

  1. What is the total size of the destination drive you are writing to?
  2. What is the total amount of free space on the destination drive you are writing to?
  3. What is the type and write speed of the destination drive that you are writing to (external vs. internal, HDD vs. SSD, X MB/s write speed; if external, how is it connected; if HDD, RPM)?
  4. What is the first version of OBS where you observe the bad behavior?
  5. Have you attempted to leave OBS at "Stopping Recording..." for an extended period of time (5 minutes, 10 minutes, longer)?
  6. Are you running any other recording/capture/encoding software in the background (e.g., NVIDIA Shadowplay, Windows Game Bar)?
  7. Can you reproduce this on preset P1-P4?
  8. Could you please confirm whether or not you have lookahead enabled?

I have a vague idea of part of the problem, but I don't know enough about how it all pieces together to voice my thoughts at this time. We are looking into it.

Yamakuzure commented 1 year ago

> As there is already an abundance of information in this Issue, please answer the following questions concisely to confirm some details:
>
> 1. What is the total size of the destination drive you are writing to?

256GiB

> 2. What is the total amount of free space on the destination drive you are writing to?

197GiB at the moment

> 3. What is the type and write speed of the destination drive that you are writing to (external vs. internal, HDD vs. SSD, X MB/s write speed; if external, how is it connected; if HDD, RPM)?

Internal KXG60ZNV1T02 NVMe KIOXIA 1024GB (6 Gbps)

I made several tests using fio on Gentoo Linux under various workloads.

The worst result was:

  READ: bw=298MiB/s (312MB/s), 298MiB/s-298MiB/s (312MB/s-312MB/s), io=3070MiB (3219MB), run=10306-10306msec
  WRITE: bw=99.6MiB/s (104MB/s), 99.6MiB/s-99.6MiB/s (104MB/s-104MB/s), io=1026MiB (1076MB), run=10306-10306msec

and the best result was:

  READ: bw=792MiB/s (831MB/s), 792MiB/s-792MiB/s (831MB/s-831MB/s), io=3070MiB (3219MB), run=3875-3875msec
  WRITE: bw=265MiB/s (278MB/s), 265MiB/s-265MiB/s (278MB/s-278MB/s), io=1026MiB (1076MB), run=3875-3875msec

> 4. What is the first version of OBS where you observe the bad behavior?

27.2.4. Due to the other issue (*) I did not try any other 27.2.x version before it.

There were some failed recordings with 1K files as far as I remember. The problem got out of hand with full freezes in November after updating to 28.1.

(*) Mentioned in the other issue, the ones about my settings and the new translation after the ffmpeg upgrade.

> 5. Have you attempted to leave OBS at "Stopping Recording..." for an extended period of time (5 minutes, 10 minutes, longer)?

Yeah, I went for breakfast and came back half an hour later to a still "Stopping Recording..."

Interestingly enough tabbing out of the game I am recording sometimes, not always, ends the stopping. The other times I have to kill obs and ffmpeg using task manager.

> 6. Are you running any other recording/capture/encoding software in the background (e.g., NVIDIA Shadowplay, Windows Game Bar)?

NVIDIA Shadowplay is deactivated. Windows Game Bar is not installed. Origin/EA overlays are deactivated.

> 7. Can you reproduce this on preset P1-P4?

I never tried that, but will try this out tomorrow.

> 8. Could you please confirm whether or not you have lookahead enabled?

I have it currently enabled.

One odd thing: I never had it enabled, but after updating to 27.2 with the new ffmpeg version, I had to turn it on, or I would lose the first few seconds of the video due to "encoding lag".

Now that I have rolled back to 27.1.3 I did not bother to turn it off, as you can see above, and have the first few seconds of encoding lag with it turned on... I wonder whether they go away if I turn lookahead off again...

> I have a vague idea of part of the problem, but I don't know enough about how it all pieces together to voice my thoughts at this time. We are looking into it.

Thank you very very much!

RytoEX commented 1 year ago

> 27.2.4. Due to the other issue (*) I did not try any other 27.2.x version before it.

I think this is the first I've heard of this specific Issue occurring in 27.2.x. Do you have any logs of this Issue occurring in 27.2.x?

The other issue (presumably #6062) was about FFmpeg's NVENC preset translation. It is not about encoder failures. This Issue, as far as I can tell, is about an encoder failure of some kind. Let us please keep this Issue scoped to its specific problem.

Yamakuzure commented 1 year ago

> Do you have any logs of this Issue occurring in 27.2.x?

No. I would have to install that version first. I can try tomorrow. Here are today's experiments:


recordEncoder.json after upgrading to obs-29.0.2 and starting it:

{
    "bf": 2,
    "bitrate": 96000,
    "cqp": 12,
    "lookahead": true,
    "max_bitrate": 256000,
    "multipass": "qres",
    "preset": "mq",
    "preset2": "p5",
    "psycho_aq": false,
    "rate_control": "CQP",
    "tune": "hq"
}

13 recordings so far in Mass Effect 2 Legendary Edition without any issues. :smile:

Updated the settings via File->Settings->Output :

Switched Multipass Mode from "Two Passes (Quarter Resolution)" to "Single Pass".
Turned off Look-ahead.
Lowered Max B-frames from 2 to 0.

resulting recordEncoder.json:

{
    "bf": 0,
    "bitrate": 96000,
    "cqp": 12,
    "lookahead": false,
    "max_bitrate": 256000,
    "multipass": "disabled",
    "preset": "mq",
    "preset2": "p5",
    "psycho_aq": false,
    "rate_control": "CQP",
    "tune": "hq"
}
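To double-check which keys actually changed between the two recordEncoder.json dumps above, a small diff helps. This is a minimal sketch; `diff_settings` is an illustrative helper name, and the two JSON documents are copied from this comment.

```python
import json

BEFORE = json.loads("""
{
    "bf": 2, "bitrate": 96000, "cqp": 12, "lookahead": true,
    "max_bitrate": 256000, "multipass": "qres", "preset": "mq",
    "preset2": "p5", "psycho_aq": false, "rate_control": "CQP", "tune": "hq"
}
""")

AFTER = json.loads("""
{
    "bf": 0, "bitrate": 96000, "cqp": 12, "lookahead": false,
    "max_bitrate": 256000, "multipass": "disabled", "preset": "mq",
    "preset2": "p5", "psycho_aq": false, "rate_control": "CQP", "tune": "hq"
}
""")


def diff_settings(before, after):
    """Map each changed key to its (before, after) pair; a missing key shows as None."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k)) for k in keys if before.get(k) != after.get(k)}


for key, (old, new) in diff_settings(BEFORE, AFTER).items():
    print(f"{key}: {old!r} -> {new!r}")
```

For these two files it reports only `bf`, `lookahead` and `multipass` as changed, which matches the three GUI changes listed above.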

4 recordings with these settings, then the next full "Stopping..." freeze came. (Waited 5 minutes, did not disappear.) After closing OBS Studio, I had to kill obs64.exe in task manager. (The fifth recording did not show up as a 1K file this time.)

(Log: https://obsproject.com/logs/EpEQzhN4pHkB5Sup)

08:40:56.010: ==== Recording Start ===============================================
08:40:56.010: [ffmpeg muxer: 'adv_file_output'] Writing file 'D:/Shadow/video/obs/2023-02-16_08-40-55.mkv'...
08:41:14.016: Stopping recording due to hotkey
08:46:28.911: [game-capture: 'Mass Effect 2 LE'] ----------------- d3d11 capture freed ----------------

Direct Copy&Paste, I did not delete anything between line 3 and 4, there simply was nothing.
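The silent stretch in this excerpt can be measured directly from the timestamps. The following is a minimal sketch, assuming every line starts with an `HH:MM:SS.mmm` timestamp as in the OBS logs shown here and that the excerpt does not cross midnight; the helper names are illustrative.

```python
from datetime import datetime


def parse_time(line):
    """Parse the leading HH:MM:SS.mmm timestamp of an OBS log line."""
    return datetime.strptime(line[:12], "%H:%M:%S.%f")


def largest_gap(lines):
    """Return (gap_in_seconds, line_before, line_after) for the longest silence.

    Assumes at least two lines and that the excerpt does not cross midnight.
    """
    best = None
    for before, after in zip(lines, lines[1:]):
        gap = (parse_time(after) - parse_time(before)).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


EXCERPT = [
    "08:40:56.010: ==== Recording Start ===============================================",
    "08:40:56.010: [ffmpeg muxer: 'adv_file_output'] Writing file 'D:/Shadow/video/obs/2023-02-16_08-40-55.mkv'...",
    "08:41:14.016: Stopping recording due to hotkey",
    "08:46:28.911: [game-capture: 'Mass Effect 2 LE'] ----------------- d3d11 capture freed ----------------",
]

gap, before, after = largest_gap(EXCERPT)
print(f"{gap:.3f} s of silence after: {before}")
```

For this excerpt the longest gap is the one after "Stopping recording due to hotkey", a little over five minutes, consistent with the wait described above.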

Went back to top configuration to see whether the 13 were just luck... (CPU usage did not go down significantly, so I think I can keep lookahead and multipass.)


Only one recording this time, froze on second recording.

(Log: https://obsproject.com/logs/TOOGfOiZAunZDzZu)

Maybe something is stuck or jammed when ending the first session? I will reboot and try a complete fresh login.

... unfortunately I got a call then and my time was up.

I will try again tomorrow, and will then generate the 27.2.4 log you asked for.

RytoEX commented 1 year ago

The most recent logs notably lack this line:

[NVIDIA NVENC H.264 (FFmpeg) encoder: 'advanced_video_recording'] Encoding queue duration surpassed 5 seconds, terminating encoder

This is perplexing, as all other cases I've observed had this. If you're getting encoder freezes/failures and that line is not present, then I'm afraid my lead is gone.

Yamakuzure commented 1 year ago

> ... unfortunately I got a call then and my time was up.
>
> I will try again tomorrow, and will then generate the 27.2.4 log you asked for.

No, the 13 were just luck. Second clip froze this morning.

This time I killed the ffmpeg process (after closing OBS via the close icon). My idea was that ffmpeg might hang and simply block OBS, but it did not. obs64.exe is still there.

> If you're getting encoder freezes/failures and that line is not present, then I'm afraid my lead is gone.

Maybe it's a new symptom in OBS 29.0.2? I will now uninstall OBS 29.0.2 and then install 27.2.4; maybe its log is more telling if I can reproduce this.

Yamakuzure commented 1 year ago

Started 27.2.4, removed lookahead and b-frames again and deactivated multipass in the settings.json:

{
    "bf": 0,
    "bitrate": 96000,
    "cqp": 12,
    "lookahead": false,
    "max_bitrate": 256000,
    "multipass": "single",
    "preset": "p5",
    "psycho_aq": false,
    "rate_control": "CQP",
    "tune": "hq"
}

(Although I am pretty sure 27.2.4 did not have a "multipass" setting.)

First attempt froze: https://obsproject.com/logs/zimeg5x27HHGulU7

Yamakuzure commented 1 year ago

I can not reproduce the issue on Gentoo Linux with obs-studio-29.0.2 built with gcc 12.2.1 using Qt-5.15.8, nvidia-drivers-525.85.05 and ffmpeg-4.4.3

Configuration printed: Found FFmpeg: /usr/lib64/../lib64/libavcodec.so (found version "58.134.100") found components: avcodec avfilter avdevice avutil swscale avformat swresample

What version does OBS Studio for Windows ship exactly? All I could find was:

 $ strings ./obs-plugins/64bit/obs-ffmpeg.dll | grep av | grep dll
avcodec-59.dll
avformat-59.dll
avdevice-59.dll
avutil-57.dll

That is either ffmpeg 5.0 or ffmpeg 5.1 ... which one is it? I only found that FFmpeg REQUIRED in obs-ffmpeg CMakeLists.txt, but no indication whether it is 5.0 or 5.1.

I could update to ffmpeg 5.1 on my Gentoo and see whether the issue can be reproduced there. But I would have to rebuild some really heavy packages like QtWebEngine, so I'd rather not do it if OBS is built against ffmpeg 5.0.
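The `strings | grep` check above can be reproduced in a few lines of Python, together with the mapping from the libavcodec major version to an FFmpeg release series. This is only a sketch: the helper name and the regular expression are illustrative, and the table is limited to the two majors that come up in this thread (58 for FFmpeg 4.x, 59 for FFmpeg 5.0/5.1).

```python
import re

# libavcodec major version -> FFmpeg release series shipping that major.
AVCODEC_MAJOR_TO_FFMPEG = {
    58: "4.x",
    59: "5.0 or 5.1",
}

# Matches names such as "avcodec-59.dll" inside raw binary data.
DLL_NAME = re.compile(rb"(av[a-z]+)-(\d+)\.dll")


def ffmpeg_dll_majors(binary):
    """Return {library_name: major_version} for av*-NN.dll names found in raw bytes."""
    return {m.group(1).decode(): int(m.group(2)) for m in DLL_NAME.finditer(binary)}


# Synthetic stand-in for the contents of obs-ffmpeg.dll; a real check would
# read the file with open(path, "rb").read().
data = b"\x00avcodec-59.dll\x00avformat-59.dll\x00avdevice-59.dll\x00avutil-57.dll\x00"

majors = ffmpeg_dll_majors(data)
print(majors)
print("FFmpeg series:", AVCODEC_MAJOR_TO_FFMPEG.get(majors["avcodec"], "unknown"))
```

As in the manual check, the import names alone only narrow it down to a release series; they cannot distinguish 5.0 from 5.1.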

RytoEX commented 1 year ago

> No, the 13 were just luck. Second clip froze this morning.

I don't know what this means. Could you please clarify what version froze?

> Maybe it's a new symptom in OBS 29.0.2?

Unlikely. The specific code for that log line was added in OBS Studio 28.

> What version does OBS Studio for Windows ship exactly? All I could find was:

FFmpeg 4.4.1 (plus some cherry-picked commits and patches) for OBS Studio 27.2. FFmpeg 5.0.1 for OBS Studio 28.0 and 28.1. FFmpeg 5.1.2 for OBS Studio 29.0. The DLL properties would give not only the FFmpeg versions but also the exact individual library versions.

I will add that there is a similar report of this occurring on Fedora with OBS Studio 28 (#7534), though I would ask that you please stick to this GitHub Issue as we consider how or whether they are the same or just related. My previous theory was that perhaps commit https://github.com/obsproject/obs-studio/commit/898256d41620878367bc40a2dd66d236e892c99f played a part in this, or that some other 28-specific changes may be involved (the FFmpeg encoders were refactored). I am still not 100% convinced that the encoder failure you're seeing in 27.2.x is the same encoder failure in 28.x+ - you may simply be running into multiple different types of encoder failure.

I will also add that so far, I have only seen these issues with FFmpeg NVENC, which in your case, is being used because of the specific settings you have selected.

If you would like to provide additional debugging information, you could build OBS with these lines enabled (using #if 1 or removing #if 0 and #endif), and then wait for a failure to occur. Please note that the log will be very spammy.

Yamakuzure commented 1 year ago

No, the 13 were just luck. Second clip froze this morning.

I don't know what this means. Could you please clarify what version froze?

Version 29.0.2, the same version with which I had done 13 recordings that went just fine.

What version does OBS Studio for Windows ship exactly? All I could find was:

FFmpeg 4.4.1 (plus some cherry-picked commits and patches) for OBS Studio 27.2. FFmpeg 5.0.1 for OBS Studio 28.0 and 28.1. FFmpeg 5.1.2 for OBS Studio 29.0. The DLL properties would give not only the FFmpeg versions but also the exact individual library versions.

Ah, okay. I was on Linux and only did some inspection of the content of the full package zip and the source archive.

Before I answer the rest, here are today's experiment results. (I am still hoping that this is "just" some weird accumulation of issues on my system, and that the "freezing" in obs is just a symptom...)


Went back to 27.1.3 and after a reboot I recorded over 50 clips without any problems. But it got me thinking: I can not reproduce this "freezing" on Linux, neither with ffmpeg 4.4.3, nor 5.1.2.

So there must be something happening with all the updates, downgrades, uninstalls and reinstalls on Windows. Or it is a Windows-only thing, but I am much more willing to believe that some left-over "cruft" on my Windows installation causes this.

So let's be more thorough this time:

  1. Uninstall obs-27.1.3 via uninstaller, this time including all settings, scenes, etc. (I always left that box unchecked.)
  2. Issue $ find /cygdrive/{c,p}/ -iname '*obs-*' -or -iname '*ffmpeg*' 2>/dev/null in Cygwin64 terminal.
  3. Use ccleaner (Yes, I know...) to scan registry for issues. Some orphans, nothing sticks out.
  4. find ended. Interesting how often ffmpeg.dll appeared. Only two things stood out: "Shark007" and "P:\ffmpeg". I don't need either so uninstall/remove them.
  5. Reboot. (It is Windows.)
  6. Install OBS Studio 29.0.2 and set desktop shortcut to "Start as Administrator". (Windows...)
  7. Reboot again. (Windows!!!)
  8. Set up OBS Studio from scratch.

Erm.. the default is to disable look-ahead, and to enable Psycho Visual Tuning? Really?

The resulting recordEncoder.json astonishes me:

{
    "bf": 0,
    "cqp": 12,
    "psycho_aq": false,
    "rate_control": "CQP"
}

Looks like it only stores what is not set to default. Which leads to the question: what else is hard-coded that I could not have taken into account yet?
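To see what the encoder would effectively run with, the stored file can be overlaid on a table of defaults. The following is a minimal sketch of that idea only. `ASSUMED_DEFAULTS` is a hypothetical table for illustration: the `lookahead` and `psycho_aq` values follow the defaults observed above, while `bf` and `multipass` are guesses taken from the logged settings. The real defaults are defined in the encoder code and may differ between OBS versions.

```python
import json

# Hypothetical defaults, for illustration only (see lead-in).
ASSUMED_DEFAULTS = {
    "bf": 2,
    "lookahead": False,
    "multipass": "qres",
    "psycho_aq": True,
}


def effective_settings(stored_json, defaults):
    """Overlay the stored (non-default) values on top of the defaults."""
    merged = dict(defaults)
    merged.update(json.loads(stored_json))
    return merged


def non_default_settings(effective, defaults):
    """Return only the keys whose value differs from (or is missing in) the defaults."""
    return {
        key: value
        for key, value in effective.items()
        if key not in defaults or defaults[key] != value
    }
```

Feeding the recordEncoder.json content shown above into `effective_settings` makes the implicit values (here `lookahead` and `multipass`) visible, and `non_default_settings` reproduces the sparse form that gets written to disk.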

First Tryout

This is weird. The settings when I had absolutely no encoding lag in 29.0.2 versus now:

[FFmpeg NVENC encoder: 'advanced_video_recording'] settings:
OLD                         NEW
NVIDIA NVENC H.264 (FFmpeg) NVIDIA NVENC H.264 (FFmpeg)
rate_control: CQP           CQP
bitrate:      0             0
cqp:          12            12
keyint:       250           250
preset:       p5            p5
tuning:       hq            hq
multipass:    qres          qres
profile:      high          high
width:        2560          2560
height:       1440          1440
b-frames:     2             0
psycho-aq:    0             0
GPU:          0             0

Note: DXGI_SWAP_CHAIN_DESC is also the same.

Alright... I am desperate, so I now set bf to 2, disabled multipass and enabled both lookahead and psycho_aq. Resulting config:

{
    "bf": 2,
    "cqp": 12,
    "lookahead": true,
    "multipass": "disabled",
    "psycho_aq": true,
    "rate_control": "CQP"
}

Second Tryout

That did the trick, 0.0% encoding lag again.

Unfortunately, after 11 recordings went smoothly, the 12th froze again. But the recording eventually stopped by itself when I exited the game, so it left a 1K mkv behind. Here is the log snippet, with my comments:

09:53:43.067: ==== Recording Start ===============================================
09:53:43.067: [ffmpeg muxer: 'adv_file_output'] Writing file 'D:/Shadow/video/obs/2023-02-18_09-53-42.mkv'...
  === I saw that no output was generated ===
09:54:21.388: Stopping recording due to hotkey
  === Exited the game when button stayed on "Stopping..." ===
09:54:33.706: [game-capture: 'LE2'] ----------------- d3d11 capture freed ----------------
09:54:33.763: [game-capture: 'LE2'] capture window no longer exists, terminating capture
09:54:33.763: [game-capture: 'LE2'] capture stopped
09:54:39.457: [ffmpeg muxer: 'adv_file_output'] Output of file 'D:/Shadow/video/obs/2023-02-18_09-53-42.mkv' stopped
09:54:39.457: Output 'adv_file_output': stopping
 === The frames have been seen and counted, they just did not get recorded. ===
09:54:39.457: Output 'adv_file_output': Total frames output: 1
09:54:39.457: Output 'adv_file_output': Total drawn frames: 6743 (6767 attempted)
09:54:39.457: Output 'adv_file_output': Number of lagged frames due to rendering lag/stalls: 24 (0.4%)
09:54:39.457: ==== Recording Stop ================================================

At least I did not have to kill it and there is something written to the log this time. Here is the full log: https://obsproject.com/logs/1MaQU0makUrUaC27 Here is the 1K MKV: https://mega.nz/file/3NBEQLKQ#dYvyG8TkHsgqX3abPMMfUYF7ROSs4qK93LG0Hbrqw6I
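The telltale sign in the snippet above is the gap between "Total frames output" and "Total drawn frames". A small script can scan a log for that pattern. This is a minimal sketch, assuming the two log lines keep the wording shown above and appear in that order; the function name and the `min_ratio` threshold are arbitrary choices for illustration, not anything OBS defines.

```python
import re

FRAMES_OUTPUT = re.compile(r"Total frames output: (\d+)")
FRAMES_DRAWN = re.compile(r"Total drawn frames: (\d+)")


def find_stalled_outputs(log_text, min_ratio=0.5):
    """Return (output_frames, drawn_frames) pairs where far fewer frames
    were written than drawn. min_ratio is an arbitrary threshold."""
    stalled = []
    output_frames = None
    for line in log_text.splitlines():
        match = FRAMES_OUTPUT.search(line)
        if match:
            output_frames = int(match.group(1))
            continue
        match = FRAMES_DRAWN.search(line)
        if match and output_frames is not None:
            drawn = int(match.group(1))
            if drawn > 0 and output_frames / drawn < min_ratio:
                stalled.append((output_frames, drawn))
            output_frames = None  # pair consumed, wait for the next output
    return stalled
```

Run against a full log file, this lists every recording in the session whose output was cut short, which makes it easier to count failures across many clips.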


I will add that there is a similar report of this occurring on Fedora with OBS Studio 28 (#7534), though I would ask that you please stick to this GitHub Issue as we consider how or whether they are the same or just related. My previous theory was that perhaps commit 898256d played a part in this, or that some other 28-specific changes may be involved (the FFmpeg encoders were refactored). I am still not 100% convinced that the encoder failure you're seeing in 27.2.x is the same encoder failure in 28.x+ - you may simply be running into multiple different types of encoder failure.

The more I think about it, the more I agree.

I will also add that so far, I have only seen these issues with FFmpeg NVENC, which in your case, is being used because of the specific settings you have selected.

Yes, jim-nvenc is limited to the NV12 color format, and 4:2:0 with 2 planes does not cut it for me. It may be fine for streaming, but not for the post processing I do. Maybe "jim" could add more color formats? At least I420 and I444 for a start?

If you would like to provide additional debugging information, you could build OBS with these lines enabled (using #if 1 or removing #if 0 and #endif), and then waiting for a failure to occur. Please note that the log will be very spammy.

Thank you very much for the Link, I will try to get it built on my dev VM. (Gaming dual-boot has no dev tools, but I have a VM for that.)

Please note that the log will be very spammy.

Oh, don't worry. Some of my tools I built at work spill out tons of lines, like one for every *alloc/free, when run in debug mode. As long as I can grep or regex search, I'll be fine. 😉

Yamakuzure commented 1 year ago

My previous theory was that perhaps commit 898256d played a part in this, or that some other 28-specific changes may be involved (the FFmpeg encoders were refactored).

After looking over the commit briefly, several things came to mind... Maybe I should try to make a tsan build on Linux first. That's how I check my multi-threaded programs. Also, with such buffering, an lsan build might be worth the hassle...

Yamakuzure commented 1 year ago

I finally had the freeze on Linux. OBS is just sitting there with "Stopping Recording..."

I have compiled the whole thing with -fsanitize=thread and tsan had a field day already on startup.

Anyway, I hope I can hook in a gdb session to exactly see where we are.

Yamakuzure commented 1 year ago

Some first impression information: (Without any in-depth knowledge about what they mean, yet)

It does not look like the Hotkey even resulted in an attempt to stop the recording.

But ThreadSanitizer printed out a lot of warnings about data races. Almost all seem to be normal GUI threading stuff, you know, where it is not important whether a variable changes while another thread reads it, but one stood out:

WARNING: ThreadSanitizer: heap-use-after-free (pid=27156)
  Read of size 1 at 0x7b180008d4d8 by thread T13 (mutexes: write M964, write M3153, write M198293812513464144):
    #0 memcpy /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 (libtsan.so.2+0x61e50)
    #1 memcpy /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:819 (libtsan.so.2+0x61e50)
    #2 bmemdup /home/sed/pryde/obs-studio/libobs/util/bmem.c:160 (libobs.so.0+0x1165b7)
    #3 dstr_ncopy /home/sed/pryde/obs-studio/libobs/util/dstr.c:381 (libobs.so.0+0x1214bd)
    #4 <null> <null> (linux-capture.so+0xae55)
    #5 <null> <null> (linux-capture.so+0xbadd)
    #6 <null> <null> (linux-capture.so+0xcd49)
    #7 obs_source_video_tick /home/sed/pryde/obs-studio/libobs/obs-source.c:1304 (libobs.so.0+0xaabe4)
    #8 tick_sources /home/sed/pryde/obs-studio/libobs/obs-video.c:68 (libobs.so.0+0xc7298)
    #9 obs_graphics_thread_loop /home/sed/pryde/obs-studio/libobs/obs-video.c:1112 (libobs.so.0+0xcac20)
    #10 obs_graphics_thread /home/sed/pryde/obs-studio/libobs/obs-video.c:1194 (libobs.so.0+0xcb239)

  Previous write of size 8 at 0x7b180008d4d8 by thread T13 (mutexes: write M0, write M3153):
    #0 free /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:706 (libtsan.so.2+0x496ea)
    #1 a_free /home/sed/pryde/obs-studio/libobs/util/bmem.c:84 (libobs.so.0+0x116396)
    #2 bfree /home/sed/pryde/obs-studio/libobs/util/bmem.c:142 (libobs.so.0+0x116520)
    #3 darray_ensure_capacity /home/sed/pryde/obs-studio/libobs/util/darray.h:118 (libobs.so.0+0x128216)
    #4 darray_push_back /home/sed/pryde/obs-studio/libobs/util/darray.h:196 (libobs.so.0+0x1284ac)
    #5 profile_start /home/sed/pryde/obs-studio/libobs/util/profiler.c:384 (libobs.so.0+0x129c5f)
    #6 output_frame /home/sed/pryde/obs-studio/libobs/obs-video.c:878 (libobs.so.0+0xca24e)
    #7 output_frames /home/sed/pryde/obs-studio/libobs/obs-video.c:906 (libobs.so.0+0xca506)
    #8 obs_graphics_thread_loop /home/sed/pryde/obs-studio/libobs/obs-video.c:1124 (libobs.so.0+0xcac77)
    #9 obs_graphics_thread /home/sed/pryde/obs-studio/libobs/obs-video.c:1194 (libobs.so.0+0xcb239)

  As if synchronized via sleep:
    #0 nanosleep /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:358 (libtsan.so.2+0x668d0)
    #1 os_sleepto_ns /home/sed/pryde/obs-studio/libobs/util/platform-nix.c:231 (libobs.so.0+0x137753)
    #2 video_sleep /home/sed/pryde/obs-studio/libobs/obs-video.c:808 (libobs.so.0+0xc9d32)
    #3 obs_graphics_thread_loop /home/sed/pryde/obs-studio/libobs/obs-video.c:1139 (libobs.so.0+0xcad72)
    #4 obs_graphics_thread /home/sed/pryde/obs-studio/libobs/obs-video.c:1194 (libobs.so.0+0xcb239)

  Location is heap block of size 88 at 0x7b180008d480 allocated by thread T13:
    #0 malloc /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:647 (libtsan.so.2+0x3ef36)
    #1 read_packet /data/portage/portage/x11-libs/libxcb-1.15-r1/work/libxcb-1.15/src/xcb_in.c:265 (libxcb.so.1+0xf872)
    #2 _xcb_in_read /data/portage/portage/x11-libs/libxcb-1.15-r1/work/libxcb-1.15/src/xcb_in.c:1042 (libxcb.so.1+0xf872)
    #3 <null> <null> (linux-capture.so+0xae18)
    #4 <null> <null> (linux-capture.so+0xbadd)
    #5 <null> <null> (linux-capture.so+0xcd49)
    #6 obs_source_video_tick /home/sed/pryde/obs-studio/libobs/obs-source.c:1304 (libobs.so.0+0xaabe4)
    #7 tick_sources /home/sed/pryde/obs-studio/libobs/obs-video.c:68 (libobs.so.0+0xc7298)
    #8 obs_graphics_thread_loop /home/sed/pryde/obs-studio/libobs/obs-video.c:1112 (libobs.so.0+0xcac20)
    #9 obs_graphics_thread /home/sed/pryde/obs-studio/libobs/obs-video.c:1194 (libobs.so.0+0xcb239)

  Mutex M964 (0x7b9000024428) created at:
    #0 pthread_mutex_init /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:1295 (libtsan.so.2+0x52dc2)
    #1 pthread_mutex_init_recursive /home/sed/pryde/obs-studio/libobs/util/threading.h:61 (libobs.so.0+0x324ee)
    #2 obs_init_data /home/sed/pryde/obs-studio/libobs/obs.c:898 (libobs.so.0+0x36eb1)
    #3 obs_init /home/sed/pryde/obs-studio/libobs/obs.c:1131 (libobs.so.0+0x37c81)
    #4 obs_startup /home/sed/pryde/obs-studio/libobs/obs.c:1218 (libobs.so.0+0x3810e)
    #5 StartupOBS /home/sed/pryde/obs-studio/UI/obs-app.cpp:1572 (obs+0xa4968)
    #6 OBSApp::OBSInit() /home/sed/pryde/obs-studio/UI/obs-app.cpp:1659 (obs+0xa50e3)
    #7 run_program /home/sed/pryde/obs-studio/UI/obs-app.cpp:2466 (obs+0xa8946)
    #8 main /home/sed/pryde/obs-studio/UI/obs-app.cpp:3358 (obs+0xac191)

  Mutex M3153 (0x7b700000b770) created at:
    #0 pthread_mutex_init /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:1295 (libtsan.so.2+0x52dc2)
    #1 graphics_init /home/sed/pryde/obs-studio/libobs/graphics/graphics.c:161 (libobs.so.0+0xd4b7a)
    #2 gs_create /home/sed/pryde/obs-studio/libobs/graphics/graphics.c:207 (libobs.so.0+0xd4e84)
    #3 obs_init_graphics /home/sed/pryde/obs-studio/libobs/obs.c:466 (libobs.so.0+0x34fb7)
    #4 obs_reset_video /home/sed/pryde/obs-studio/libobs/obs.c:1408 (libobs.so.0+0x3931b)
    #5 AttemptToResetVideo /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:4321 (obs+0x1fd283)
    #6 OBSBasic::ResetVideo() /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:4444 (obs+0x1fdbb5)
    #7 OBSBasic::OBSInit() /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:1761 (obs+0x1ea087)
    #8 OBSApp::OBSInit() /home/sed/pryde/obs-studio/UI/obs-app.cpp:1698 (obs+0xa5293)
    #9 run_program /home/sed/pryde/obs-studio/UI/obs-app.cpp:2466 (obs+0xa8946)
    #10 main /home/sed/pryde/obs-studio/UI/obs-app.cpp:3358 (obs+0xac191)

  Mutex M198293812513464144 is already destroyed.

  Mutex M0 (0x7b9000024290) created at:
    #0 pthread_mutex_init /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:1295 (libtsan.so.2+0x52dc2)
    #1 obs_init_video /home/sed/pryde/obs-studio/libobs/obs.c:656 (libobs.so.0+0x35cfb)
    #2 obs_reset_video /home/sed/pryde/obs-studio/libobs/obs.c:1456 (libobs.so.0+0x3959e)
    #3 AttemptToResetVideo /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:4321 (obs+0x1fd283)
    #4 OBSBasic::ResetVideo() /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:4444 (obs+0x1fdbb5)
    #5 OBSBasic::OBSInit() /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:1761 (obs+0x1ea087)
    #6 OBSApp::OBSInit() /home/sed/pryde/obs-studio/UI/obs-app.cpp:1698 (obs+0xa5293)
    #7 run_program /home/sed/pryde/obs-studio/UI/obs-app.cpp:2466 (obs+0xa8946)
    #8 main /home/sed/pryde/obs-studio/UI/obs-app.cpp:3358 (obs+0xac191)

  Thread T13 'libobs: graphic' (tid=29943, running) created by main thread at:
    #0 pthread_create /data/portage/portage/sys-devel/gcc-12.2.1_p20230121-r1/work/gcc-12-20230121/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x62895)
    #1 obs_init_video /home/sed/pryde/obs-studio/libobs/obs.c:667 (libobs.so.0+0x35d7f)
    #2 obs_reset_video /home/sed/pryde/obs-studio/libobs/obs.c:1456 (libobs.so.0+0x3959e)
    #3 AttemptToResetVideo /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:4321 (obs+0x1fd283)
    #4 OBSBasic::ResetVideo() /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:4444 (obs+0x1fdbb5)
    #5 OBSBasic::OBSInit() /home/sed/pryde/obs-studio/UI/window-basic-main.cpp:1761 (obs+0x1ea087)
    #6 OBSApp::OBSInit() /home/sed/pryde/obs-studio/UI/obs-app.cpp:1698 (obs+0xa5293)
    #7 run_program /home/sed/pryde/obs-studio/UI/obs-app.cpp:2466 (obs+0xa8946)
    #8 main /home/sed/pryde/obs-studio/UI/obs-app.cpp:3358 (obs+0xac191)

The others might just be Qt-related and more or less harmless, but I'll go through them nevertheless.

Yamakuzure commented 1 year ago

I am learning my way around the sources while fixing various issues. Most are harmless but still bugs, so might be worth it. Some are more serious, like heap use after free. (My work can be seen here: https://github.com/Yamakuzure/obs-studio/commits/fix_multithreading) (*)

However, I got a new clue today.

OBS Studio 27.1.3 also got messed up lately, since EA forced all Origin users, including me, to switch to the abysmally buggy "EA Desktop App". This app is notorious for hogging CPU and GPU, making recording almost impossible whenever it does anything in the background. (**)

The behavior is different: for example, the recording starts, runs for a few seconds, and then halts. When I stop it, I get the "Stopping..." button that never goes away on its own, but it is normally enough to quickload the last save, which re-initializes the game engine, to make OBS finish stopping. The resulting video is only as long as the few seconds the recording actually ran. This is with HAGS turned ON.

With HAGS turned off I get not only the freezes, which I had never seen on that version before, but also a very odd thing that I cannot really explain:

Sometimes the recording does start, but the FPS in the stats Window get displayed in red and are somewhere between 50 and 90. The game itself is more like 0.5 FPS. Ending the recording takes a while, but the moment OBS stops recording, FPS go back up to solid 145 ingame and fixed 120 in the stats window.

My best bet right now is that the EA desktop App is "stealing" GPU memory in the background, and the little that OBS adds is then too much for my T2000. So I wondered whether the GPU memory consumption has risen between OBS 27.1 and 29.0, and whether that triggered the issue I have experienced.

And to make things worse, Windows started greeting me with full screen "ads" about upgrading to Windows 11.

(*) Working in the sources feels odd, like a trip back to the 1990s. Hard break on 80 characters? Really? My tty on my laptop has a width of 236 characters already. Limiting oneself to 80 characters does not help anywhere with today's displays. It only causes odd line breaks that are harder to read. (But that's just my 2 cents.)

(**) Although meant to fix problems with Shadowplay, the pinned comment here: https://www.youtube.com/watch?v=DwxdASZz5is has a workaround to make the EA desktop App play nicely with recording. Might be helpful with OBS, too? I'll test it this week-end. (***)

Also I found this post: https://www.reddit.com/r/linux_gaming/comments/q98x0u/disable_origin_client_hardware_acceleration/

So it looks like I never had problems with HAGS enabled because it caused origin.exe to run on the Intel HD GPU. And when I deactivated HAGS the first time, the nvidia drivers scheduled it onto the nvidia card? This is something I have to check, too.

(***) I was so curious, that I tried it out. First result: HAGS off and lookahead on results in 80%+ frame loss due to encoding lag. (Reminds me of something...) Disabled lookahead and put OBS onto Intel HD: Perfect recording, no issues, but audio only, so...

Re-enabled HAGS and made sure that EA Stuff is kept away from my Nvidia Quadro. But time is up for today, so tests will come tomorrow.

RytoEX commented 1 year ago

There's a lot here. I'm going to respond to the items that seem relevant to this Issue.

So there must be something happening with all the updates, downgrades, uninstalls and reinstalls on Windows. Or it is a Windows-only thing, but I am much more willing to believe that some left-over "cruft" on my Windows installation causes this.

I'm not inclined to agree with that assessment at this time.

OBS Studio 27.1.3 also got messed up lately, since EA forced all Origin users, including me, to switch to the abysmally buggy "EA Desktop App". This app is notorious for hogging CPU and GPU, making recording almost impossible whenever it does anything in the background. (**)

To me, this is starting to sound like nothing in one particular version in OBS causes one distinct issue.

Interestingly enough tabbing out of the game I am recording sometimes, not always, ends the stopping.

The behavior is different: for example, the recording starts, runs for a few seconds, and then halts. When I stop it, I get the "Stopping..." button that never goes away on its own, but it is normally enough to quickload the last save, which re-initializes the game engine, to make OBS finish stopping. The resulting video is only as long as the few seconds the recording actually ran.

This almost sounds like the GPU is freeing up resources during a quick load, which causes the encoder to finally dump its backlogged data. This is how things are supposed to work, as far as I understand.

My best bet right now is that the EA desktop App is "stealing" GPU memory in the background, and the little that OBS adds is then too much for my T2000. So I wondered whether the GPU memory consumption has risen between OBS 27.1 and 29.0, and whether that triggered the issue I have experienced.

With what information I have available, I'd be inclined to agree that you're hitting some kind of resource limit, causing the encoder to get behind and then get stuck or fail. What does GPU-Z say about the GPU's total memory? What does it say about load and memory usage when you encounter one of these problem sessions? It should be easy to verify the GPU memory usage of different versions of OBS.

(**) Although meant to fix problems with Shadowplay, the pinned comment here: https://www.youtube.com/watch?v=DwxdASZz5is has a workaround to make the EA desktop App play nicely with recording. Might be helpful with OBS, too? I'll test it this week-end. (***)

I don't see anything in this video that is something that affects this scenario. They are talking about Shadowplay hooking EA App as a game. OBS would not do that, unless it's a Vulkan app. Even then, I do not believe it would contribute to a potential encoder failure.

Also I found this post: https://www.reddit.com/r/linux_gaming/comments/q98x0u/disable_origin_client_hardware_acceleration/

Sure, disabling hardware acceleration for other apps may free up GPU resources (in the specific example in the post, they seem to get back about 300MB of GPU RAM). You may be forcing the app to do software rendering, which may be less performant, or maybe your CPU has enough headroom to accommodate this, or maybe the app changes its behavior if it does not have access to hardware acceleration.

All of this is starting to sound like a system resource usage issue causing an encoder to fail or get behind (the frames are coming out of the encoder slower than we're putting them in).

So it looks like I never had problems with HAGS enabled because it caused origin.exe to run on the Intel HD GPU. And when I deactivated HAGS the first time, the nvidia drivers scheduled it onto the nvidia card?

Intel GPUs do not support HAGS. HAGS should not necessarily decide which GPU an app runs on. It just makes the GPU-based scheduling processor handle GPU task scheduling instead of the CPU. That origin.exe switched GPUs is odd, but should not be based on the HAGS setting. The one point of interest here is that perhaps, because origin.exe was not on your NVIDIA GPU, the NVIDIA GPU had more headroom (either in GPU Load or in Memory Used/Available), which would still point to a resource constraint issue.


Again, let us please focus on the important points of this Issue and not get lost in the weeds on what workarounds people recommend for avoiding conflicts between EA App and other software, or the OBS coding style, or Windows 11 ads. This Issue, while important to us, is already extremely long, and I find that such Issues are less approachable and off-putting to people who would otherwise be interested in resolving the problem. Thank you for your understanding.

Yamakuzure commented 1 year ago

What does GPU-Z say about the GPU's total memory? What does it say about load and memory usage when you encounter one of these problem sessions? It should be easy to verify the GPU memory usage of different versions of OBS.

nvidia-smi stats on idle before starting anything:

Driver Version                            : 528.24
CUDA Version                              : 12.0

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Product Name                          : Quadro T2000
    Product Brand                         : Quadro RTX
    Product Architecture                  : Turing
(...)
    FB Memory Usage
        Total                             : 4096 MiB
        Reserved                          : 147 MiB
        Used                              : 399 MiB
        Free                              : 3549 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 2 MiB
        Free                              : 254 MiB
    Compute Mode                          : Default
(...)
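To compare memory headroom across OBS versions, the "FB Memory Usage" block can be pulled out of the text report automatically. This is a minimal sketch, assuming the report keeps the `Name : value MiB` layout shown above; the function names are made up for illustration, and the parser stops at the first line after the block that does not match that layout.

```python
import re

MIB_LINE = re.compile(r"(\w+)\s*:\s*(\d+) MiB")


def parse_fb_memory(smi_text):
    """Extract the 'FB Memory Usage' values (in MiB) from a text report."""
    values = {}
    in_block = False
    for line in smi_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("FB Memory Usage"):
            in_block = True
            continue
        if in_block:
            match = MIB_LINE.match(stripped)
            if not match:
                break  # first non-matching line ends the block
            values[match.group(1).lower()] = int(match.group(2))
    return values


def free_fraction(mem):
    """Share of framebuffer memory reported as free (0.0 to 1.0)."""
    return mem["free"] / mem["total"]
```

Capturing one report at idle and one during a problem session, once per OBS version, would give directly comparable numbers.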

I have logged with GPU-Z, and the values say that the maximum memory consumption was 3,913 MB.

While this happened, I recorded a video with obs-27.1.3 with up to 368,000 kbit/s.

But I also went through some scenes where I had this stuttering, and they now worked fine after I set the wretched EA desktop App to be locked on the Intel UHD. So this was completely unrelated and not a clue at all. Recording with 27.1.3 also works fine again, so this was all useless noise thanks to that EA garbage. :-(

This Issue, while important to us, is already extremely long, and I find that such Issues are less approachable and off-putting to people who would otherwise be interested in resolving the problem. Thank you for your understanding.

I got carried away a bit, sorry.

Meanwhile I have finished with the Sanitizers. If I can no longer reproduce the freezing on Linux, I will build it for Windows and see what happens there.

Yamakuzure commented 1 year ago

Finally a breakthrough!

I caught obs-ffmpeg-mux here:

(...)
#5 safe_read ffmpeg-mux.c:638                    size_t in_size = fread(data, 1, size, stdin);
#6 ffmpeg_mux_get_header ffmpeg-mux.c:653        bool success = safe_read(&info, sizeof(info)) == sizeof(info);
#7 ffmpeg_mux_get_extra_data ffmpeg-mux.c:672    if (!ffmpeg_mux_get_header(ffm))
#8 ffmpeg_mux_init_internal ffmpeg-mux.c:1104    if (!ffmpeg_mux_get_extra_data(ffm))
#9 ffmpeg_mux_init ffmpeg-mux.c:1116             ret = ffmpeg_mux_init_internal(ffm, argc, argv);
#10 main ffmpeg-mux.c:1283                       ret = ffmpeg_mux_init(&ffm, argc, argv);
(...)

The source in question is:

637 while (size > 0) {
638     size_t in_size = fread(data, 1, size, stdin);
639     if (in_size == 0)
640         return 0;
641
642     size -= in_size;
643     data += in_size;
644 }

So fread() basically blocks on stdin forever, because the expected header is never sent. This also explains why the recording cannot be stopped: fread() will keep waiting until hell freezes over.
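For contrast, here is what a bounded read looks like. This is a minimal, POSIX-only sketch in Python rather than C, and it is not how obs-ffmpeg-mux is implemented. It only illustrates the difference between an unconditional blocking read and one that gives up after a deadline. `read_exact` and its timeout parameter are hypothetical names.

```python
import os
import select
import time


def read_exact(fd, size, timeout_s):
    """Read exactly `size` bytes from a pipe fd.

    Returns the bytes, or None if EOF or the deadline is hit first.
    Uses select(), so this works on POSIX pipes but not Windows pipes.
    """
    deadline = time.monotonic() + timeout_s
    chunks = []
    remaining = size
    while remaining > 0:
        wait = deadline - time.monotonic()
        if wait <= 0:
            return None  # deadline passed
        ready, _, _ = select.select([fd], [], [], wait)
        if not ready:
            return None  # nothing arrived in time
        chunk = os.read(fd, remaining)
        if not chunk:
            return None  # writer closed the pipe (EOF)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With a read like this, a reader whose header never arrives can notice the stall and exit, instead of keeping the parent stuck on "Stopping...".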

It looks like both ffmpeg_hls_mux_data() and ffmpeg_mux_data() deactivate the stream (aka ffmpeg muxer and pipe) if no packets are sent, assuming an encoder error. Unfortunately they do so without checking whether the muxer got its headers already. If it didn't, it'll be stuck forever waiting for it. (*)

I'll test a possible fix for that one tonight.

Edit: It looks like there is more to it. Still investigating why the muxer does not get its headers in certain circumstances.

Yamakuzure commented 1 year ago

UPDATE: Rebased my branch to 29.1.0-beta4 (https://github.com/Yamakuzure/obs-studio/commit/194424dc721bad7bd8a07178fee36f76e94fc162)


Here is an update from my side:

I have just recorded 68 Clips of varying lengths without any freeze. I finally fixed the issue. Or so it would seem, as my previous "best" was 12 clips before obs froze.

(Update: After the rebase on 29.1.0-beta4 I successfully recorded 46 clips with obs built in release mode.)

I believe the two relevant commits to look at were:

  1. The first is split into three parts:

  2. https://github.com/Yamakuzure/obs-studio/commit/737fdb5863ea2fac9e9d351fb35e607d63de9e73 "Add exit packet to ffmpeg muxer to signal deactivation request" After this commit, I could no longer trigger the end-of-recording freeze.
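The idea behind an exit packet can be sketched independently of the actual commit. The following is a hypothetical illustration only: the header layout (`<BI`: one type byte, four size bytes), the constants, and the function names are invented here and do not reflect the real obs-ffmpeg-mux pipe protocol or the linked change.

```python
import io
import struct

# Hypothetical framing: 1-byte packet type + 4-byte little-endian payload size.
HEADER = struct.Struct("<BI")
PKT_DATA = 0
PKT_EXIT = 1


def encode_packet(kind, payload=b""):
    """Frame a payload with the hypothetical header."""
    return HEADER.pack(kind, len(payload)) + payload


def run_muxer(stream):
    """Consume packets until an EXIT packet or EOF.

    Returns (payloads, reason), where reason is "exit" or "eof".
    """
    payloads = []
    while True:
        head = stream.read(HEADER.size)
        if len(head) < HEADER.size:
            return payloads, "eof"
        kind, size = HEADER.unpack(head)
        if kind == PKT_EXIT:
            return payloads, "exit"  # explicit shutdown request
        payload = stream.read(size)
        if len(payload) < size:
            return payloads, "eof"
        payloads.append(payload)
```

The point of the design is that the writer can always send something that wakes a blocked reader, even if no encoder data was ever produced, so the reader does not depend on a header that may never come.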

I changed a lot, and I also had to fork https://github.com/Yamakuzure/libdshowcapture.git, https://github.com/Yamakuzure/obs-browser.git, https://github.com/Yamakuzure/ftl-sdk.git, and https://github.com/Yamakuzure/obs-websocket.git for everything to work.

(Note: Could we please get rid of that 1980s line length limit? It was really odd to work with an IDE that was two-thirds empty. Also, the 80-character line length limits cause really nasty line breaks which make stuff hard to read. Thanks.)

(*) Many commits are needed, so these cannot simply be cherry-picked, sorry.

I know these are many many changes, but at least the whole suite now compiles fine with -Wall + -Wextra + -Werror on GNU GCC and with /W4 + /WX on Visual Studio 2022.

Also, there is stuff I could not test. I have no DeckLink card, and I could not get AMF to compile. I do not have an AMD card, so I could not have tested it anyway.

I have rebased on 29.1.0-beta4, but everything together is 95 commits now. So I either squash what can be squashed (some cleanup would be good, too) and put out a PR, which I would have to do for the submodules simultaneously, or the most relevant/interesting commits get cherry-picked, which might be tedious work to do. Or the fixes get "re-created".

Thank you very much for your patience and support!

Yamakuzure commented 1 year ago

I wanted to make sure that this is not a dud, so I cloned the repo again and built 29.1.0-beta4 without my patches.

What shall I say, 56 clips later it is clear that somewhere between beta3 and beta4 the issue got fixed.

I then went back to 29.0.2, where the issue still existed, applied my patches, and the issue was gone.

So, to cut this short, it is better to go with the official fix, whatever that was, and forget about my fork. ;-) (Unless you find some of my commits helpful, like the massive improvement to the logging. 😁)

And no, I am not salty that someone "beat me to it", although I invested several hundred hours into this. I am just happy that I can film again without any problems. The last 6 months filming was a massive pain and the random freezes cost me so much time and nerves, I am so so happy that this is over.

Thanks again for your patience and a huge "Thank You!!!" to whoever fixed it!