xenia-project / game-compatibility

https://xenia.jp
BSD 3-Clause "New" or "Revised" License
467 stars 145 forks source link

4B4F07FE - Fist of the North Star: Ken's Rage #735

Open cesys opened 7 years ago

cesys commented 7 years ago

Marketplace

Tested on (https://github.com/benvanik/xenia/commit/92a4b90ed0fceeb9240a1436d3ffe1d9164c358a)

Issues:

Intro and Menus are ok. On Vulkan, the game crashes as soon as the 3D kicks in because of some missing shader translations. On OpenGL, the game goes into the 3D playable part, but there are a lot of missing shader translations. Same graphic issue as Dragon Ball Z Ultimate Tenkaichi and Dragon Ball: Raging Blast 2. Reds and Blues are swapped when using Vulkan during the OPENING.wmv video. Due to missing connection with the blitter that swap the Red and Blue channels through the shader.

Log:

OpenGL Vulkan

Screenshot(s):

OpenGL Video taken from YouTube. Didn't post it personally. Vulkan vulkan

Labels:

state-menus gpu-corrupt-drawing (on Vulkan) gpu-shader-errors

cesys commented 7 years ago

Debugging Vulkan case (OpenGL is negligible since the support will be probably dropped). Instrumented command_processor.cc#bool CommandProcessor::ExecutePacketType0(RingBuffer* reader, uint32_t packet) with these logs:

!> 00000004 XXX: packet -> FFFF -> 42100
!> 00000004 XXX: count -> FFFF -> 5
!> 00000004 XXX: reader->read_count() -> FFFF -> 83D4
!> 00000004 XXX: count * sizeof(uint32_t) -> FFFF -> 14
!> 00000004 XXX: base_index -> FFFF -> 2100
!> 00000004 XXX: write_one_reg -> FFFF -> 0
!> 00000004 XXX: reg_data -> FFFF -> FFFFFF
!> 00000004 XXX: target_index -> FFFF -> 2100
!> 00000004 XXX: m -> FFFF -> 0
!> 00000004 XXX: reg_data -> FFFF -> 0
!> 00000004 XXX: target_index -> FFFF -> 2101
!> 00000004 XXX: m -> FFFF -> 1
!> 00000004 XXX: reg_data -> FFFF -> 0
!> 00000004 XXX: target_index -> FFFF -> 2102
!> 00000004 XXX: m -> FFFF -> 2
!> 00000004 XXX: reg_data -> FFFF -> FFFFFF
!> 00000004 XXX: target_index -> FFFF -> 2103
!> 00000004 XXX: m -> FFFFFF -> 3
!> 00000004 XXX: reg_data -> FFFFFF -> F
!> 00000004 XXX: target_index -> FFFFFF -> 2104
!> 00000004 XXX: m -> FFFFFF -> 4

Seems like we're overwriting the register at 0x2103 (XE_GPU_REG_VGT_MULTI_PRIM_IB_RESET_INDX) with 0xFFFFFF. This is triggering the assertion in pipeline_cache.cc: assert_true(regs.multi_prim_ib_reset_index == 0xFFFF || regs.multi_prim_ib_reset_index == 0xFFFFFFFF);

Digging deeper. If somebody has a clue on why this is happening, please let me know.

cesys commented 7 years ago

Fixed issue for Vulkan. We should allow 0xFFFFFF as it's a valid index. Vulkan doesn't allow directly to specify the index, it just allows to enable primitive reset determining automatically the index depending on the type of VkIndexType. 0xFFFFFF is allowed by OpenGL as index, that's were the inconsistency happens. After bypassing the assert, I'm hitting the same bug I hit for OpenGL (in OpenGL I don't hit the assert because glPrimitiveRestartIndex allows 0xFFFFFF). The next bug seems to be and issue with type 0 packets. For some reasons, the ring buffer get corrupted and instead of getting a type 3 packet I get a weird type 0 packet which doesn't parse. Investigating why this is happening.

DrChat commented 7 years ago

We have that assert installed in pipeline_cache.cc because we haven't yet setup primitive reset in the Vulkan graphics system.

As for CP corruption - that sounds interesting, please do keep us posted.

cesys commented 7 years ago

What's I'm saying is that primitive reset is actually there. Vulkan doesn't allow to specify the index because it doesn't need it. It just allows to set the enable/disable flag, which is already properly set in the current implementation of pipeline_cache.cc. Based on the value of VkIndexType Vulkan automatically sets the right value of the index which is either 0xFFFF in case of 16 bit or 0xFFFFFFFF in case of 32 bits. However, sometimes we get some PM4 packets that try to set the index to 0xFFFFFF which is allowed as an index in OpenGL (that's why OpenGL version doesn't hit this first bug (there's no assert there anyway)). My point is that we should add the condition: || regs.multi_prim_ib_reset_index == 0xFFFFFF to the assert and rely on: if (regs.pa_su_sc_mode_cntl & (1 << 21)) { state_info.primitiveRestartEnable = VK_TRUE; } else { state_info.primitiveRestartEnable = VK_FALSE; } for the primitive reset, which should be enough providing VkIndexType is set accordingly (not 100% sure about this last bit). In the meanwhile, I'm working on the CP corruption. As soon as I have a patch (or a full root cause analysis) for both issues I'm gonna share it for review.

DrChat commented 7 years ago

With primitive restart enabled, Vulkan treats a special index value (2^(n bits index) - 1) as a primitive restart rather than an ordinary index. We don't insert any indices that will trigger this functionality when submitting a drawcall, so primitive restart does not work at the moment.

cesys commented 7 years ago

As far as I know that the index is VkIndexType (the one I was talking about earlier). We do the following in vulkan_command_processor.cc VkIndexType index_type = info.format == IndexFormat::kInt32 ? VK_INDEX_TYPE_UINT32 : VK_INDEX_TYPE_UINT16; Not sure that's enough though. I'm not comfortable with the code yet. Of course, I trust your judgement on this. BTW, I'm not able to post any message on xenia-dev. Do you know why?

cesys commented 7 years ago

In WorkerThreadMain, at the line: uint32_t write_ptr_index = write_ptrindex.load(); the write_ptr_index ends up being broken and that broken value is passed to ExecutePrimaryBuffer. I logged the values of write_ptr_index: ... write_index : 00000968 write_index : 0000096B write_index : 0000097D write_index : 00000016 <- corrupted value!

After that corruption I get this official log: !> 00000004 ExecutePacketType0 overflow (read count 00005A60, packet count 00007F10) and consequently: !> 00000004 **** PRIMARY RINGBUFFER: Failed to execute packet.

Problem is I don't have any idea on how to understand why writeindex get corrupted. After some debugging I realized that value is populated by a a callback registered at: // Let the processor know we want register access callbacks. memory->AddVirtualMappedRange( 0x7FC80000, 0xFFFF0000, 0x0000FFFF, this, reinterpret_cast(ReadRegisterThunk), reinterpret_cast(WriteRegisterThunk));

but I'm missing the mechanism that triggers the callback.

DrChat commented 7 years ago

write_ptrindex represents the GPU register that holds the write pointer. The game directly writes the register, which triggers an exception that eventually makes its way to GraphicsSystem::WriteRegister.

Although unlikely, are you sure the write pointer didn't just loop around because the ringbuffer is 0x1000 bytes?

cesys commented 7 years ago

Pretty sure it's not the case. Here you have another sequence: !> 00000004 XXX: ExecutePrimaryBuffer -> read_index : 00000BC3 , write_index : 00000BC9 !> 00000004 XXX: ExecutePrimaryBuffer -> read_index : 00000BC9 , write_index : 00000BD5 !> 00000004 XXX: ExecutePrimaryBuffer -> read_index : 00000BD5 , write_index : 00000BDB !> 00000004 XXX: ExecutePrimaryBuffer -> read_index : 00000BDB , write_index : 00000BE1 !> 00000004 XXX: ExecutePrimaryBuffer -> read_index : 00000BE1 , write_index : 00000BF3 !> 00000004 XXX: ExecutePrimaryBuffer -> read_index : 00000BF3 , write_index : 00000016 hardly doubt you're gonna hit a big chunk after 00000BF3 that will force the ring buffer to wrap around. BTW, it always ends up being 16. Do you know which OPCODES are involved? After my analysis, it seems that I should start from OPCODE_LOAD in constant_propagation_pass.cc, but I'm not sure that's the right track.

DrChat commented 7 years ago

Don't worry about the JIT opcodes, the ones dealing with integer arithmetic and similar have been verified.

I've had a similar issue before where the ring buffer wasn't the correct size and it wrapped around incorrectly. That might be worth looking into, though it may be unlikely in this case.

cesys commented 7 years ago

Seems like in: auto graphics_system = kernel_state()->emulator()->graphics_system(); graphics_system->InitializeRingBuffer(ptr, log2_size); the ring buffer is initialized to be 0x8000 bytes meaning 32k so we're definitely far from wrapping around... Unless the dimension value get corrupted.

cesys commented 7 years ago

Found the issue. In this specific title the ring buffer got reinitialized at least twice with: InitializeRingBuffer Immediately after the second initialization, the game crashes because read_ptrindex is not reset to 0. Adding: void CommandProcessor::InitializeRingBuffer(uint32_t ptr, uint32_t log2_size) { + read_ptrindex = 0; primary_bufferptr = ptr; primary_buffersize = uint32_t(std::pow(2u, log2_size)); } makes the emulator work for some more cycles, until I hit another issue: Expression: furthest_usage->use->instr->block==block Going ahead with the analysis.

cesys commented 7 years ago

Commenting the assert makes the game go a little further (it plays another short WMV). After that I get a shader translation exception. Which means we're approaching the 3D part. With OpenGL the emulator actually goes into the game. Vertex shader doesn't seem to do much but the pixel shader is okkey (2D images are there).

DrChat commented 7 years ago

Submit a PR with that read_ptrindex fix and I'll pull it

As for that assert - that looks rather serious as I haven't encountered that before.

cesys commented 7 years ago

Ok, will do.

cesys commented 7 years ago

PR at: https://github.com/benvanik/xenia/pull/684

cesys commented 7 years ago

Investigating the next assertion (not sure if it's correct to keep doing that in the context of this issue). If I bypass that assert, the next issue I get is related to the missing implementation of: kGetTextureGradients It would be good to get some hints on how to cope with shader translation (unless somebody is already developing that specific translation).

jackchentwkh commented 7 years ago

In my opinion, you're doing the right thing. To focus on one major title and to solve the issues step by step is a efficient way to make a emulator work. really good job, cesys.

cesys commented 7 years ago

Thanks you, jackchentwkh. Definitely agree with you on that. The motivation of making a title you like work is strong. However, I'm facing two major issues on which you may be able to cast some light:

jackchentwkh commented 7 years ago

well, I am not in the xenia-dev group, so your questions are my questions, too. But you did a better job then I did. I am still trying to understand the code of xenia.

cesys commented 7 years ago

Since we have: assert_true(furthest_usage->value->def->block == block); assert_true(furthest_usage->use->instr->block == block); that means that every time we spill a register we should have the following condition:

furthest_usage->value->def->block == furthest_usage->use->instr->block == block (1)

I added this check (1) in DumpUsage and I noticed that the condition (1) is violated multiple times during the register_alloc_pass workflows. However, the assertion happens only when, after the violation, the TryAllocateRegister fails and then the Run procedure tries to call SpillOneRegister (this sequence occurs in this title). In the regular workflows, instead of calling SpillOneRegister the Run procedure calls PrepareBlockState (it restarts) and the violation disappear as everything is cleaned up.

In bool RegisterAllocationPass::Run(HIRBuilder* builder) if you retrieve the list of instr starting from the builder and you perform the check (1) on each instruction of the list you may hit the violation. Therefore, the violation is already present in the list of instr associated to the blocks of the builder.

Not sure if the violation is supposed to happen sometimes or it should never happen. In the first case the assertion are not justified. Also, commenting the assertions, everything goes ahead smoothly till the 3D part of the game (which lacks of shader translation and crashes after a while, but for internal reason -> in game crash with the black/green window).

Does somebody know what's going on?

cesys commented 7 years ago

Found the issue. Altivec instruction InstrEmit_stvrx seems to have an issue when calling f.BranchFalse(eb, skip_label);. This will force and EndBlock which will set currentblock = NULL; When the altivec instruction goes ahead calling: ea = f.And(ea, a); the call: Instr* i = AppendInstr(OPCODE_AND_info, 0, AllocValue(value1->type)); will assess that currentblock is NULL and will call AppendBlock(); creating a new block that will create the inconsistency between the correct "value->def->block" and the wrong "use->instr->block" as the call: i->set_src1(value1); will set this new block for the variable "use". Hard to build a fix, until I really go deep to the root cause. This code doesn't seem to be straightforward.

DrChat commented 7 years ago

Okay, I can see what you mean here: Value ea is calculated at the beginning of the instruction, and we create a branch if (ea & 0xF == 0).

Problem is ea is used in a different block than where it was created.

Very detailed analysis, awesome work!

cesys commented 7 years ago

Thanks, DrChat. Do you have a fix in mind? I tried to remove the EndBlock from the BranchFalse (only for that call) but the system crashes. BTW, that's the only case were we call BranchFalse, apart from InstrEmit_branch (were it's supposed to be called) so I guess the BranchFalse pattern is not yet suited for this case.

cesys commented 7 years ago

PR for the fix on Altivec instructions at https://github.com/benvanik/xenia/pull/687

cesys commented 7 years ago

Working on the Vulkan shader issue related to reds and blues swapping (the issue is present in multiple titles, as I could read in the different compatibility issue descriptions).

Irixion commented 7 years ago

There's the colour swap, as well as the "half screen" issue a lot of games suffer from. Not sure if they're related. Out of curiosity, are you working on an AMD or nVidia card?

Edit: This fix fixes the Sonic 06 demo crash on Vulkan now. Only thing working still is just the HUD elements.

cesys commented 7 years ago

I'm with an nVidia GTX 980. With this title and with my system, I only get the color swap issue. I tried with Minecraft Story Mode and in that case I don't get any of these issues (except that the game freezes when entering the 3D part). Happy to know this fix is useful for other titles, though.

DrChat commented 7 years ago

@cesys If you're working on the color swap issue, I can fill you in on some details:

The Xenos renders to the framebuffer in a BGR format, whereas Vulkan renders RGB. So, when the game resolves the framebuffer to a texture, the colors are stored in the opposite format. We can fix this by manually blitting any resolves and swapping the color channels during that blit.

Basically, I need a blitter setup here for Vulkan. If you want to work on that, it's all yours!

Irixion commented 7 years ago

I suppose the colour issues on AMD cards is unrelated. Though from the AMD Vulkan issue, I can guess that most people are with team green and not team red.

DrChat commented 7 years ago

@Irixion Nope - it isn't related to the type of card. I've got an AMD card myself, so AMD will have first-class support now.

cesys commented 7 years ago

@DrChat, if that's the case what is not clear to me is why the swapping is occurring only when the WMV is played and not during the menu navigation, for instance. Does that mean that I need to do the channel swapping only for specific situations and that for other cases it's already ok?

DrChat commented 7 years ago

Sometimes the game will resolve to a texture and swap using that texture as a source, which means no swapping occurs.

The color swapping only occurs when the game renders using a resolved texture as an input texture to a shader. Swapping the channels backwards in the texture handling code would be a giant hack, but using a manual blitter is a proper fix.

cesys commented 7 years ago

Noticed that for Opegn GL the swapping has been done in the Intialize method: if (!swap) oC = oC.bgra; I'm assuming this is true also for Vulkan. Which means I have to work on the shader: blit_color_frag to force the same behavior. EDIT: After further analysis, I realized that the shader for Vulkan contains the logic for swapping already. The logic depends on a 'swap' flag, like for the OpenGL shader, but apparently the flag is never True. Working on the root cause.

cesys commented 7 years ago

Something weird is going on. Even if I modify blit_color.frag to force the shader to swap the channel, that does never happen. Which means, blit_color is not triggering in my case.

DrChat commented 7 years ago

The Vulkan color blitter is currently not set up.

cesys commented 7 years ago

Now I understand what you meant with "Basically, I need a blitter setup here for Vulkan.". Sorry about that, not really aligned with the terminology 100% yet. At this point, my understanding is that you need to wire the Vulkan blitter to the vulkan_command_processor and to the texture_cache for Vulkan, like it's already done for OpenGL. Is that right?

DrChat commented 7 years ago

Just to the Vulkan CP under IssueCopy. As of now, it blits from the RenderCache using vkCmdBlitImage.

I can work on wiring up the Vulkan CP if you want to focus on getting that blitter working (I've been pretty lazy about working on that :)

cesys commented 7 years ago

So what you are suggesting is to go at this line in render_cache: vkCmdBlitImage(command_buffer, tile_view->image, VK_IMAGE_LAYOUT_GENERAL, image, image_layout, 1, &image_blit, filter); and add a stage before that to swap B and R? That's gonna do the swapping for each frame.

DrChat commented 7 years ago

Nope! What I'm suggesting is we render to the target texture in a new renderpass, and swap the two channels in the pixel shader.

cesys commented 7 years ago

Reading Vulkan tutorials and trying to understand the code to do the integration. This stuff is a mess...

cesys commented 7 years ago

@DrChat, okkey here's the first thing: if I understand correctly, your idea is to substitute VkCmdBlitImage in render_cache.cc with a new RenderPass and we leverage the shaders that are already in place to do the color channels swapping. Therefore, my idea is to exploit the Vulkan configuration we have in the Blitter class in blitter.cc, make it available for render_cache.cc and use the RenderPass in Blitter to do the color channels swapping. Now, there are few questions:

Still reading stuff and trying to understand the code, but the learning curve is pretty flat at the beginning.

DrChat commented 7 years ago

Nope! We don't want to leverage RenderPass to create a renderpass for blitting. The blitter will track all rendering state necessary to blit images.

The blitter does indeed come with both VS/PS. The real magic will happen in the PS.

We'll have to wrap the VkImages as the API does not allow us to directly sample/draw them.

cesys commented 7 years ago

This stuff is a real mess. The object model of Vulkan is extremely complicated and I really don't know where to start with the integration. Is it possible to have a chat session about this topic so that I clarify some doubts?

DrChat commented 7 years ago

Just ask your questions on IRC. I see you do that, but you keep leaving afterwards - you'll have to idle in there until somebody replies.

cesys commented 7 years ago

Last time I posted a question I waited for more than a couple of hours. Also, as of now, I tried so many times to ask questions there and I never get any answers that it seems silly to keep trying. I can't leave the IRC chat opened 24/7, unfortunately.

Parovozik commented 7 years ago

Ingame on current builds (Vulkan). Visible only HUD and some lighting effects xenia fist of star 1 vk mp4_snapshot_02 47_ 2017 04 12_03 18 01 xenia fist of star 1 vk mp4_snapshot_02 47_ 2017 04 12_03 18 22

jackchentwkh commented 7 years ago

you have to use a 7-24 client for IRC in order to keep all the conversation logged. That's not possible for everyone. I have the same problem, too. on the other hand, github has gitter_im, which is similar to IRC, but you don't have to be online for 7-24, all messages will be logged. Benvanik has a gitter im at Benvanik/Xenia, why not use that? or create a new one?

cesys commented 7 years ago

@jackchentwkh , I'll try that but it seems to unrelated to the xenia-dev IRC channel. @Parovozik , which build are you using. In the latest build you should get an assertion about some missing shader translations before entering the ingame. Did you commented the assertion an recompiled? Anyway, that's where I've got too after commenting the assertion, but before taking a look to the missing shader translations I wanted to get familiar to Vulkan/Xenia integration through this reds/blues swap issue.

cesys commented 7 years ago

@DrChat , any news about the render pass on VkImages? Did you gathered any thoughts on that?