Open Venemo opened 5 years ago
One thing I noticed though, that aco performance looks just a tiny bit lower than llvm in that case
I think 110.2 vs 108.9 FPS is well within an acceptable margin, esp. considering that we haven't yet started working on any Navi-specific optimizations yet.
Why are in general such "hazards" on Navi?
Every GPU has similar problems, but each generation has their own specific issues. Even though the hardware developers do their best to make sure the hardware works as intended, sometimes these sort of problems still slip in.
Is it due to the need to support backwards compatibility with GCN ISA in RDNA?
The RDNA ISA is not backwards compatible with GCN. It is similar for sure, but you can't run the same binary on them.
Is there a specified way to report test results? I've been playing Tetris Effect (non-steam) with ACO on my Navi 10 and it works beautifully.
You can use the main benchmarks issue (#36), or may be there should be a separate one for Navi for clarity?
@TacoDeBoss Thanks, I appreciate your feedback! I think we currently don't have a better way to report your results, but my collegaues will correct me there if I'm wrong.
The RDNA ISA is not backwards compatible with GCN. It is similar for sure, but you can't run the same binary on them.
Wasn't that the whole point why AMD still support Wave64 on Navi in less efficient way, while natively suggesting to use Wave32? I.e. I thought it's supposed to be able to run GCN targeted shaders fine, but also should work with ones that are RDNA only. May be that kind of combination could potentially increase the probability of such hazards.
But anyway, I got your point about all GPUs potentially having such hardware issues.
I tried SteamVR home with aco-navi, which does not hang. But I noticed the whole scene looks much dimmer (lower brightness).
@yshui Is this a navi specific problem?
Most GPU hangs look to be fixed by the hazards updates, at least the games I have tested. Dragon Quest XI looks to be completely hang free, at least in the places where LLVM still hangs.
Wasn't that the whole point why AMD still support Wave64 on Navi in less efficient way, while natively suggesting to use Wave32? I.e. I thought it's supposed to be able to run GCN targeted shaders fine, but also should work with ones that are RDNA only.
Well, it depends on what you mean by "compatible". What I meant was that a compiled shader that could run on Polaris or Vega can not run unmodified on Navi, so they are not compatible in this sense. This is not a suprising fact though, every GCN generation has (sometimes subtle) differences that makes it necessary to adapt the compiler to those generations.
If you are asking how close are the RDNA and GCN ISA, I think the difference between last gen GCN and RDNA is not that big. AMD could have called it GCN 6th gen, but maybe the marketing department decided they could use a refreshed branding. In case you are curious about the differences, you can take a look at the GFX10 commits in mesa.
May be that kind of combination could potentially increase the probability of such hazards.
The hazards are usually a result of some rare corner case that the hardware designers forgot to take care about. Most of them fall into either of two categories:
But anyway, I got your point about all GPUs potentially having such hardware issues.
Modern hardware is very complicated. It's not only GPUs that have problems, but everything else, eg. also CPUs. You can see a news item almost every week that talks about how some complicated sequence of instructions can cause unintended side effects; or how some instructions are unreliable on some processors.
@aqxa1 Thanks for the feedback, it's nice to know that I'm heading in the right direction. Unfortunately those hazard mitigations are not perfect yet, and they don't yet take care of every possible occourence of the hazards. I plan to continue improving it next week.
Further testing TW3, I found some anomaly with shadows:
llvm:
aco:
Save at that place: tw3_save_shadows_bug.zip
You can move forward using aco, and notice that shadows are missing in some distance.
@Venemo Sorry, but I am not sure. I don't have other hardware I can test on.
@shmerl @yshui Since there have been other, similar reports from users on Polaris and Vega, I am inclined to think that these rendering glitches are not specific to aco-navi.
Since the discussion in this issue is starting to get very unwieldy, can I ask you guys to please file separate issues about these rendering glitches? (Make sure to still mention that you experienced the problem on aco-navi, it is useful to know.)
I would like to keep this issue about just "getting things to work" on Navi, in other words, ensure that most games can start and play without a hang or a crash. Once that's done, I will start looking at possible optimizations and individual rendering glitches.
@aqxa1 Can you please take a look at the latest aco-navi and see if you still get any hangs?
@aqxa1 I updated my tests. I didn't test all of the games that were already fixed as of your previous hazard update, so I can't say if there are any regressions.
I will say that everything looks good, and Retroarch (Mednafen PSX HW) is the only application that seems to still be hanging. And it hangs with LLVM as well.
See my previous comment.
Thank you @aqxa1 I'm happy to hear that. The hazard mitigation series has now been reviewed by Daniel and Rhys, so I'm gonna land those upstream. With that taken care of, only the subgroups MR remains before we can enable the Navi support in ACO by default.
Can anyone post a guide or a link to guide of compile this branch for Steam on Ubuntu 19.04 x64? Steam needs 32-bit drivers/libs too.
@Atrosha You can just wait a day or two, as it is going to be merged into mainline Mesa master - https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2471
@Atrosha You can take a look at my trivial script to build both 64-bit and 32-bit mesa with amdgpu support for Debian and Ubuntu. I am not sure it will work on Ubuntu, but it should be easy to modify.
https://gist.github.com/baryluk/1041204eff4cc4fad6f1508afe67b562
Compiling 32-bit on 64-bit is a huge mess in Debian testing at least, unless you want to do it on 32-bit system.
For regular 64-bit build (on Debian, but could work for Ubuntu possibly), see: https://gist.github.com/shmerl/f4e5f76871239158cf083e37c5da56f4
To build Mesa-aco-navi, use it like this:
mesa_branch='aco-navi' mesa_repo='https://gitlab.freedesktop.org/Venemo/mesa.git' src_dir="${HOME}/build/mesa-aco" mesa_dir='/opt/mesa-aco' mesa_debian_build.sh
@baryluk : I looked at your scripts. Interesting idea with using gcc-i686 toolchain. I can try that, since multilib method is totally borked.
@shmerl It totally does work.
It should work on both Debian and Ubuntu. It took some time to figure out exactly which packages to install, and how to use proper pkg-config
for 32-bit builds, env variables and cross files, but once you know it, it is straight forward. The only missing part for 32-bit build is libclc / libclang, but it is really needed just for OpenCL and even if enabled it will not work (there are other bugs in OpenCL implementation in Mesa). Filled a bug upstream - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=941755
Other than that I use it daily to build mesa and 64-bit and 32-bit applications works just fine. I didn't want to make the script "too configurable", use Makefiles or functions (I don't see a point of having a function if it is only going to be called once), to make it straight forward to mess around, understand how it works or modify quickly.
I updated a script a little to be much easier to select LLVM and GCC versions, and made some package installation for amd64 and i386 less redundant and share common package selection automatically.
https://gist.github.com/baryluk/1041204eff4cc4fad6f1508afe67b562
Building aco-navi branch is one line change.
I already filed a similar bug before: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890801
Anyway, apparently cross compiler has better situation than multilib one, so I'll try to update my script to use the former.
@baryluk at first I change Debian to LLVM repo in Ubuntu. Script uninstalls glslang-dev
both: i386 and amd64 while compiling. And it needs to install gcc(g++)-9-i686 manually, coz installs version 8 by default.
So after successful compile enable-new-mesa-opt.source
doesn't work on Ubuntu.
@Atrosha I see. I updated the script a bit to install explicitly specific versions of gcc now.
glslang-dev is not required to build Mesa. I don't have it installed on my OS actually. But the script shouldn't make it remove it. In Debian and Ubuntu glslang-dev and glslang-tools package essentially have no dependencies so should not conflict with anything, even weird LLVM versions, and apt should never decide to remove it. So I don't know how it is possible it did remove glslang-dev.
The issue is probably somewhere else.
Could you download a new version from the gist and try again?
@shmerl My script is still kind of not formally correct , because it actually uses 64-bit llvm-config for 32-bit build. But fortunately both 32-bit and 64-bit versions do output same linker and compiler flags, so it does work. If they would output different include paths, it would not work. But exactly same headers are used for both, so it is not a problem. AFAIK llvm-config doesn't understand multiarch, and it is not possible to install both 32-bit and 64-bit versions at the same time, and I wouldn't expect them to support that either.
I just install one version at a time, and then remove it and install another one for different bitness build. But last time even doing that, co-installation of various stuff with impossible to get a consistent result using gcc-multilb. I'll try with cross compiler, to see how it fares.
You can specify 32-bit llvm in "native" file for meson.
@shmerl I never tried multilib. I just install cross-compilers, way easier and reliable, not to mention you can build also windows and arm binaries this way too. No you can't specificy 32-bit llvm-config in native file for meson native file because there is no 32-bit llvm-config available and co-installable with 64-bit llvm-config. But it is not a big deal. meson does have some disctinction between native and cross files, but I don't really use native. In fact I only use cross files even when compiling natively. But not to select architecture specific things, but to select llvm10 instead of other llvm versions possibly installed on the system.
No you can't specificy 32-bit llvm-config in native file for meson native file because there is no 32-bit llvm-config available and co-installable with 64-bit llvm-config
That's why I said, I remove one before installing the other. Not a big deal really. If co-installation of other stuff works, it should be OK.
@shmerl Ah, yes. You can do that of course. I just find installing and removing one then another version constantly a bit annoying.
Anyway, to avoid noise in this issue, I recommend anyone who needs help with building Mesa, to move the discussion here.
@shmerl Agreed. Totally off topic.
@shmerl @baryluk Sorry for my late reply, but if it's any help to you, here is a writeup about how I build mesa and run some apps using the built mesa: https://gist.github.com/Venemo/a9483106565df3a83fc67a411191edbd (without replacing the system package). This may need adjustments on Debian (such as /usr/lib64 and similar things), but should give you a good idea.
@Venemo: Thanks, I'll take a look. The building part seems to be also using cross compiler for 32-bit, so I'll definitely go with that to finally have a script that builds both 64-bit and 32-bit. Using it without replacing the system packages looks quite similar to the approach I'm using: https://gist.github.com/shmerl/611b5b1670eb5963d0d99a4512ed8674
@Venemo This is essentially what my scripts are doing, just fully automated plus both 64 and 32 bit build, opt and debug, plus not pollute system locations or even require root :D https://gist.github.com/baryluk/1041204eff4cc4fad6f1508afe67b562 Thanks anyway.
As ACO for Navi seems to be enabled with the AUR aco repo now I tested on my Red Devil 5700 XT: SteamVR Blade & Sorcery Gravity Lab Pavlov VR Witcher 3 I'll test some more in the next few days. When generating the shaders it was a little smoother (in VR that's quite noticable) and from there on not much difference, only a little stuttering in B&S was gone, which is nice. No graphics glitches or hangs.
To clarify from where to get latest code for testing Navi with ACO. Should we still be using https://gitlab.freedesktop.org/Venemo/mesa/commits/aco-navi or more upstream https://github.com/daniel-schuermann/mesa/commits/master ?
I recommend to just use upstream mesa which has everything that was in aco-navi, and almost everything that's in Daniel's repo. Depending on your preference with regards to stability, you can go with either 19.3 RC, or master.
EDIT: to reduce the confusion, I now deleted the old aco-navi branch.
What about ACO bug tracking, progress tracking and etc. Will that remain on Github, or also will be moved to Mesa Giltab?
I will respond to any bug you might find in ACO on Navi, regardless of which place you report it. If there is a bug report that I don't notice feel free to ping me.
Not sure how long Daniel wants to maintain this repo, I assume on the long run all of it will move to upstream Mesa. @daniel-schuermann what is your stance on this? Personally, I have a slight preference towards using Mesa Gitlab, because it makes it easier to track which commits fix which problem.
As some packages are based off this repo, we will maintain it for a little longer. It also contains one last performance-improving patch series from @pendingchaos which is not upstream yet, but should land soon. That said, I think with mesa 20.0 we'll fade it out into stable. Bug tracking and progress tracking will probably stay even further than that, but hopefully, there won't be much left at that point.
@daniel-schuermann Maybe announce a closure of accepting NEW issues on this repo (~Jan 2020?), and only deal with existing ones until they are closed. For all the other stuff redirect people to upstream mesa.
We've recently landed Wave32 support in ACO. Those willing to test it can do so using RADV_PERFTEST
. AMD recommends using this mode for compute shaders and the geometry engine. In case of pixel shaders, the old Wave64 mode is recommended since those shaders are more efficient in this mode due to how interpolation and the cache works in the hardware.
In my testing, cswave32
and gewave32
yield a couple of extra FPS in some games. This is not yet fully optimized, Rhys has an open MR which should give it a nice boost, once merged. The pswave32
on the other hand reduces overall performance and is not recommended.
If you want to test, you need to use upstream mesa with RADV_PERFTEST
# The recommended way is to use Wave32 mode for compute shaders and the geometry engine
RADV_PERFTEST=aco,cswave32,gewave32 ./Talos
In the meantime I'm working on a post-RA scheduler (based on Daniel's old code) with the intention of taking advantage of Navi's ILP (instruction level parallelism). The idea is to reorder the instructions (if possible) in such a manner that allows to execute as many as possible in parallel.
There are some issues that I'd like to resolve before opening a MR, such as compilation speed, it currently slows down ACO by about 10% which is not acceptable. However I can already confirm that it has a positive effect, I managed to squeeze out some extra frames per second from a couple of games. Will update you guys with some more concrete benchmarks once the code gets closer to final.
This issue has been quiet for a while, so here is a quick heads up:
@Venemo: Do you see any performance improvements from NGG in your tests?
@shmerl It's hard to say, because so far I've only got NGG VS covered. The "big fish" will be NGG GS. It will definitely provide some improvement, but I don't yet know how much.
However, considering your benchmark results so far, I wouldn't hold my breath that this will fix The Witcher 3 for you. One of us will have to take a close look at that game to determine where the bottleneck is. (I suspect that the problem exists in DXVK, because DXVK on Windows already performs 20% worse than D3D11, but I don't know what the problem is for sure.)
@Venemo: It did get slightly better than llvm on Navi after recent batch of optimizations, but I suspect it's using a lot of tessellation shaders, so it will benefit more once those are implemented in ACO.
@shmerl I ran some benchmarks yesterday with ACO NGG VS together with LLVM NGG VS+GS (as ACO doesn't support the merged VS+GS yet). It did not give me any noticable difference for any game that I tested with, compared to the legacy (non-NGG) pipeline. EDIT: I tested with Talos, RotTR and TW3.
Regardless of the performance implications, we'll probably eventually still need to support NGG, since it's not clear what will become of the legacy path in future chips. However it appears that NGG is not a silver bullet that would magically elevate our frame rates.
I suspect that the bottleneck lies with some very heavy "monster" pixel shaders rather than the geometry engine.
Now that NGG is on in ACO for Navi, should this message still appear?
WARNING: disabling NGG because ACO is used.
@shmerl As noted in the merge request and here, we concluded that NGG gives basically zero performance benefits, so we concluded that it's not worth to rush it into the 20.0 release. Instead we'll merge it in 20.1 when not only VS but also GS+VS is ready. There is already a LOT of improvements in ACO since 19.3 so still plenty to get excited about.
This issue is for tracking ACO's progress on Navi.
What works, what doesn't
All shader stages should work. Every Vulkan game should work.
If you find issues, please file a bug in the upstream Mesa bug tracker.
Tested hardware
Not tested with unreleased Navi cards as we don't have those. If you test with hardware that is not on the list yet, please let us know.
How to test
We suggest using the latest stable mesa, where ACO is the default compiler of the RADV Vulkan driver.
ACO is in mesa since version 19.3 but on old mesa releases, the
RADV_PERFTEST=aco
environment variable was needed.New hardware features support in Navi 1x
New hardware features support in Navi 2x
Possible optimizations
[ ] use round-robin register allocation to avoid WAR hazards (and help any post-RA scheduling)
[ ] schedule ALU instructions (after RA for easier/faster scheduling?)
[ ] choose registers to avoid bank conflicts (either as a reassignment pass or during RA)
See GCNRegBankReassign.cpp in LLVM
[ ] NGG shader based primitive culling