maurossi / linux

android-x86 kernels

Superfluous patches #15

Closed mirh closed 4 years ago

mirh commented 4 years ago

https://github.com/torvalds/linux/commit/4d19c487555e8fe6e40f645c17e12cc30d4a18bf
https://github.com/torvalds/linux/commit/5ca4d1ae9bad0f59bd6f851c39b19f5366953666
https://github.com/torvalds/linux/commit/fac01d11722c92a186b27ee26cd429a8066adfb5 (AFAIU)
https://github.com/torvalds/linux/commit/ddcccd543f5dbd841fe305452651b0f8c1d74f0f (from 5.8)
https://github.com/torvalds/linux/blob/v5.7/drivers/hid/hid-asus.c#L696

And last but not least, https://github.com/torvalds/linux/commit/a75d035fedbdecf83f86767aa2e4d05c8c4ffd95 should be the fix for the Bay Trail hanging issues (or well, at least putting aside that ~5.6 pissed off just about everybody with gen7).

With this said.. Do you have any plan to re-send DC for SI to amd-gfx? And OPEMU is working for linux, albeit in a very rough way for the moment.

maurossi commented 4 years ago

Hi, thanks for the info

My personal linux branch is for my own testing. Its main purpose is to see if android-x86 boots; other purposes have low priority for me.

When commits are merged into linux git, the corresponding patches will disappear while rebasing, or will be removed if they conflict and I see they are not needed anymore.

If you are interested in a more stable kernel, please check the android-x86 repository for kernel-5.4.

The AMD DC for SI patches cannot be sent to amd-gfx because they lack essential requirements asked for by AMD developers, for example using the dce6 headers for registers and masks, and they also lack VGA connector support, so they cannot be merged in their current state.

With kernel-5.8 some changes were introduced and screen is not turning on with HD7950, still to be investigated

Regarding instruction emulation, the expert was Wu Zhen, but he's not involved with android-x86 development anymore. You may contact Chih-Wei Huang when you have a stable instruction emulation; last time we looked into this Opemu was lacking 32bit support.

At the moment, pushing for an AMD64 target is not a priority for him, I think, but it could be interesting.

maurossi commented 4 years ago

Closing as I do not consider this an issue

mirh commented 4 years ago

> With kernel-5.8 some changes were introduced and screen is not turning on with HD7950, still to be investigated

Duh, that's also my situation (but it's a bit disheartening considering nobody seemed to give a damn about my earlier reports).
My random guess, though, was that DC could have been pretty related to that, since 5.6 started to show a warning about its failed activation attempt every now and then.

> and they also lack VGA connector support,

I don't think they consider this a (hard at least) blocker: https://github.com/torvalds/linux/commit/d9fda248046ac035f18a6e663f2f9245b4bf9470

> last time we looked into this Opemu was lacking 32bit support

My target was actually SSE4.1 on x86_64 (and the branch with full support for that should also handle all of (S)SSE3 with SSEPlus). Too bad I cannot get it to link.

maurossi commented 4 years ago

> > With kernel-5.8 some changes were introduced and screen is not turning on with HD7950, still to be investigated

> Duh, that's also my situation (but it's a bit disheartening considering nobody seemed to give a damn about my earlier reports)

I saw that you hit the issue with the HD7750. Just a question: do you also have the problem with a kernel built from linux git (without my dce patches)?

In any case I will post a message to the attention of the AMD developers, because I see it with the Athlon 200GE too, and the raven code paths are not touched by my dce60 patches.

> My random guess though, was that DC could have been pretty related to that, since 5.6 started to show a warning about its failed activation attempt every now and then.

Could you send me info to my email issor.oruam@gmail.com and the logged error?

> > and they also lack VGA connector support,

> I don't think they consider this a (hard at least) blocker: torvalds/linux@d9fda24

If it causes issues for end users it is a problem for AMD, and I also doubt that they would push to drm-next with that issue. Alex D. shared that some DAL branch had experimental VGA support; more info in the attachment.

For the DCE6 patches to be eligible for resubmission to the amd-gfx ML, correct support for the Watermark priority registers would also be required. Another issue is that in around 10% of 'modprobe amdgpu' runs there is a white shadow reflection in the lit pixels, which may be due to some race condition or to some difference between the DCE8 and DCE6 registers.

AMD GPUs are gigantic state machines driven by registers. In the case of SI, my patches in dce/dce60 only work with the dce80 headers. The dce/dce80 sources and functions were cloned and adapted as dce/dce60, and I think the dce60 headers may lack some of the dce8 registers, let's say even just one. There is an ASSERT() in the register write code that causes a fatal crash, but the visible effect is just that the monitor will not turn on. Debugging requires tracing amdgpu_dc_wreg, or trying to relax the ASSERT() to something less fatal.

The attachment has the full history, with notes and info exchanged with AMD developers:

What_is_missing_v5.txt

> > last time we looked into this Opemu was lacking 32bit support

> My target was actually SSE4.1 on x86_64 (and the branch with full support for that should also handle all of (S)SSE3 with SSEPlus). Too bad I cannot get it to link.

You may send me the instructions on how to build and integrate OPEMU, along with the build error, so I can take a look. I'm not a pro coder though.

mirh commented 4 years ago

> Could you send me info to my email issor.oruam@gmail.com and the logged error?

Ehrm.. Well, if even raven is fucked up, I guess like it's not DC. And the warning I was getting in 5.6 is probably just somebody using amdgpu_device_asic_has_dc_support without remembering that it's not just a "quiet check".

> What_is_missing_v5.txt

I can't help you with registers (sorry), but I can tell you that UVD does not have to wait for anything. See more details here and here.

> You may send me the instructions on how to build and integrate OPEMU, along with the build error, so I can take a look. I'm not a pro coder though.

You literally just check out my repo and hit make. I have a branch that just does popcnt and a few other things perfectly (without SSEplus, and without crashes in the worst case scenarios.. unlike the thing you are currently shipping) and then another with full SSEplus hooking everything.. But it doesn't build.

maurossi commented 4 years ago

Besides the warning you've been seeing with HD7750, with kernel 5.8 the AMD DC support that was working with Kaveri and Kabini by simply reverting torvalds/linux@d9fda24 is not working anymore.

At 'modprobe amdgpu' command the display turns off.

The same problem is happening with Southern Island parts with my DCE6 patches

A fix will probably solve it for Kaveri, Kabini and SI alike.

mirh commented 4 years ago

Interesting. In this case.. I guess like somebody upstream could give you a hand too? After 20 months perhaps they may also have some time to work on it themselves.

maurossi commented 4 years ago

Hi mirh,

I have opened a "Kind request for info" at https://gitlab.freedesktop.org/drm/amd/-/issues/1170 to try to identify why Kaveri and Kabini DC is not working anymore with kernel 5.8.0-rc1.

With that info, if the problem gets solved for Kaveri/Kabini, I hope that by pure luck the DCE6 patches will work again, maybe.

In the meantime I will play with removing the fatal ASSERT(mask != 0) and uncommenting the #define DM_CHECK_ADDR_0, to see whether that lets the screen light up or at least logs the nature of the problem.

Mauro

mirh commented 4 years ago

Ehrm.. For the record, I just tested your kernel-5.8rc1_si_ylng branch (with some little config adjustments to make systemd happy in manjaro, but nothing really major). And it boots nicely on my 7750, with the exception of the warning I reported above. Was there something special to do to trigger DC?

maurossi commented 4 years ago

In kernel-5.8rc1_si_ylng branch I have reapplied torvalds/linux@d9fda24 to be able to boot with amdgpu

In order to trigger the problem you need to add amdgpu.dc=1 to the grub cmdline, or load the module with modprobe amdgpu dc=1

Alex Deucher requested dmesg and, if possible, xorg.log. Since I have not built and booted on a Linux distro, could you do me a favour and see if it is easy for you to collect those logs with amdgpu.dc=1?
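
A minimal sketch of checking that the parameter really is on the kernel command line before grabbing the logs. `dc_from_cmdline` is a hypothetical helper, the output file names are made up, and the Xorg log path is an assumption that differs between distros:

```shell
#!/bin/sh
# dc_from_cmdline CMDLINE: print the value of amdgpu.dc= if it appears in the
# given kernel command line (hypothetical helper, not part of any tool).
dc_from_cmdline() {
    for word in $1; do
        case "$word" in
            amdgpu.dc=*) printf '%s\n' "${word#amdgpu.dc=}" ;;
        esac
    done
}

# Example with a literal command line:
dc_from_cmdline "quiet splash amdgpu.dc=1"

# On the booted system one would then run something like the following
# (dmesg may need root; the Xorg log location is an assumption):
#   dc_from_cmdline "$(cat /proc/cmdline)"
#   dmesg > dmesg_dc1.txt
#   cp /var/log/Xorg.0.log xorg_dc1.log
```

This only covers the grub case: when the module is loaded by hand with modprobe amdgpu dc=1, the kernel command line will not show the parameter.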

Thanks Mauro

mirh commented 4 years ago

Duh, right, I hadn't noticed that parameter is also required for SI. And I hadn't noticed that you had not set CONFIG_DRM_AMD_DC_SI.

Black screen for me too, indeed. I'll try to see what I can get...

maurossi commented 4 years ago

> Duh, right, I hadn't noticed that parameter is also required for SI. And I hadn't noticed that you had not set CONFIG_DRM_AMD_DC_SI.

Sorry, I forgot to mention that. I usually keep that option disabled in the default defconfig files.

> Black screen for me too, indeed. I'll try to see what I can get...

Thanks a lot for your help. I would never have caught the problem, as bisecting with Android takes me ages. How did you bisect with Linux?

mirh commented 4 years ago

Well, it's pretty easy on arch actually (even though having to still keep some commits on top was some crazy ride this time)

I just picked up the linux-amd-staging-drm-next-git PKGBUILD, adjusted branches and urls accordingly to your repo
Then I gave a good git diff 0cc3b89cef4f remotes/origin/kernel-5.8rc1_si_ylng > patch.patch
And I added patch -Np1 < "$startdir/patch.patch" in the build() section
Where I also moved in the "setting" parts of prepare()
Last but not least a touch of options=('ccache') to speed up everything
EDIT: oh, and then I copied _android-x86_64defconfig and I made sure to enable CONFIG_*_NS, CONFIG_DEVTMPFS, CONFIG_AUTOFS4_FS and CONFIG_DRM_AMD_DC_SI.

And with these three commands (and some little manual intervention and swearing here and there when the patch doesn't apply or build cleanly) you basically own the world

git reset --hard && git clean -fd
git bisect *what-you-got*
makepkg -efi
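
A rough sketch of how those three commands fit into one bisect round. The function names are hypothetical, and the good and bad commits are placeholders to fill in, not values from this thread:

```shell
#!/bin/sh
# start_bisect BAD GOOD: open a bisect session; git then checks out a commit
# somewhere between the two (hypothetical helper name).
start_bisect() {
    git bisect start "$1" "$2"
}

# mark_result good|bad: throw away the patched tree left over from the last
# build, then tell git how the kernel built from this commit behaved.
# git answers by checking out the next commit to test, or by naming the
# first bad commit once the range is exhausted.
mark_result() {
    git reset --hard && git clean -fd && git bisect "$1"
}

# One session, with the git commands run inside the kernel checkout and
# makepkg run from the PKGBUILD directory (assumed layout):
#   start_bisect <bad-commit> <good-commit>
#   makepkg -efi        # build and install, then reboot and test
#   mark_result bad     # or: mark_result good
#   ...repeat makepkg and mark_result until git prints
#   "<sha> is the first bad commit"
```

Resetting before marking matters here because the patch is applied on top of every checked-out commit, and git will refuse to move to the next commit over conflicting local changes.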

Please tell me that the commit I found actually makes sense because I really want to leave this dreadful experience behind me.

maurossi commented 4 years ago

I searched and tried to find a way to perform a bisect on a detached branch (with youling257's and my android patches) but I could not find one. Now I'm building with Alex's patches to see if they solve the issue.

maurossi commented 4 years ago

Indeed, reverting the bad commit you found was key to fixing the issue. BTW are you Italian? I am Italian too :-)

mirh commented 4 years ago

Yes, something gave me that impression 🙃 (in Italian in the original)

maurossi commented 4 years ago

Hello mirh, continuing in English for the audience.

Substantial progress

https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_WIP7_v5

git cherry-pick 0fd671e1ad04b..449525b92d4456b (in case you want to try it)

Improvements:

Display not turning on with DCE6 specific code paths: SOLVED
Mouse cursor size issue: SOLVED
Display "white snow": SOLVED
Lack of LVDS/VGA DAC and connectors support: NOT A PROBLEM (a pre-existing lack of DC in itself) :-)

The attachment has the dmesg log of Ubuntu 20.04 booting with kernel 5.8.0-rc4 and CONFIG_AMD_DC_SI=y

As soon as I'm done with the two remaining WARNINGs with backtrace, I'll be almost ready to re-spin the series with commit messages/documentation to amd-gfx gitlab

Cheers

Mauro

dmesg_HD7950_Ubuntu_20.04_kernel-5.8.0-rc4_CONFIG_AMD_DC_SI=y.txt

mirh commented 4 years ago

Cool. Though my 7750 became a 290 last week, so I can't help with bisecting anymore in this circumstance. On the other hand, I guess like I'm always up for OPEMU.

maurossi commented 4 years ago

No need to bisect, because the problem was in my code, now it's fixed

Thanks for the previously provided help

Mauro