plan about my lima dev - GitHub issues

yuq commented 6 years ago

Hi guys, I'm going to share the plan for my near-future lima dev work here instead of on the wiki, so that you can see it and comment on it.

Currently I want to make lima_draw.c more generic and robust, and have it support multi render target (FBO). That needs features like dynamic command stream mem alloc and a new command submit interface, which in turn need a lot of kernel driver work: new memory management and a gpu scheduler. So the plan is:

1. implement a new gpu scheduler by leveraging the 4.16 kernel drm_sched work
2. implement new memory management which supports non-contiguous mem, and maybe buffer resize too, so that user space can change the command stream mem size dynamically
3. refine lima_draw.c for dynamic mem alloc and management
4. refine lima_draw.c for the new command submit interface and multi render target support

So the work goes from the two kernel driver parts (mem management and gpu scheduler) to the user space lima_draw.c refine. Right now I'm working on the gpu scheduler part.

anarsoul commented 6 years ago

Sounds good. I'd suggest creating another Wiki page with a roadmap.

yuq commented 6 years ago

I'll consider it. I think X11/Wayland desktop support is the next goal, which means implementing all the OpenGL ES2 functions listed by piglit.

PabloPL commented 6 years ago

Also, lima needs to flush the command stream even if apps don't do it themselves (piglit does this: it doesn't call flush, it just reads back and checks the result, via glReadPixels?). I think this is important too, because tests are currently failing (because of this, no cmdstreams are executed).

yuq commented 6 years ago

Right, this will be addressed after multi render target support, which will build a command list for each render target. When something reads from a render target, we can flush the command list attached to it.
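A minimal sketch of that idea, assuming one pending command list per render target and an implicit flush on read. All names here (`render_target`, `rt_draw`, `rt_read`, `rt_flush`) are hypothetical and not taken from lima_draw.c, and the "submit" is only a counter standing in for a real GPU submission:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CMDS 64

/* Hypothetical per-render-target state (not the real lima structures). */
struct render_target {
    uint32_t pending[MAX_CMDS]; /* commands queued but not yet submitted */
    size_t   num_pending;       /* how many commands are queued */
    size_t   num_submitted;     /* total commands handed to the "GPU" */
};

/* Submit everything queued for this target. A real driver would build
 * a job here and pass it to the kernel; this sketch only counts. */
static void rt_flush(struct render_target *rt)
{
    rt->num_submitted += rt->num_pending;
    rt->num_pending = 0;
}

/* Queue one command on the list attached to this target,
 * flushing first if the list is already full. */
static void rt_draw(struct render_target *rt, uint32_t cmd)
{
    if (rt->num_pending == MAX_CMDS)
        rt_flush(rt);
    assert(rt->num_pending < MAX_CMDS);
    rt->pending[rt->num_pending++] = cmd;
}

/* A read (e.g. from glReadPixels) must see all queued rendering, so it
 * flushes this target's list even if the app never called flush.
 * Returns the number of commands submitted so far. */
static size_t rt_read(struct render_target *rt)
{
    if (rt->num_pending > 0)
        rt_flush(rt);
    return rt->num_submitted;
}
```

Because each target owns its list, reading from one target leaves the commands queued on the other targets untouched.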

enunes commented 6 years ago

> I think x11/wayland desktop support is next goal by implementing all the OpenGL ES2 functions listed by piglit.

I've been thinking about working in this direction when I get back from my vacation, to follow up on the parts I've worked on so far, which were more related to that. If I recall correctly, we may need to implement some additional buffer sharing possibilities too.

yuq commented 6 years ago

That's gem flink support, which is needed by X11 DRI2. It's optional because X11 now prefers DRI3, which uses DMABUF, but for completeness we need it.

yuq commented 6 years ago

For roadmap, here's my thought:

kernel: before we can start upstreaming our work, the following tasks need to be done:

gpu scheduler
memory manager for non-contiguous mem
drm syncobj support
complete mali450 support

mesa: support the X11/Wayland desktop, then we can start upstreaming our work:

texture mipmap
FBO support
depth support
pass most OpenGL ES2 piglit tests
compiler work
1. support control flow
2. implement all instructions not implemented yet
3. optimize code generation (optional)
4. tools for debug (optional)
  1. standalone compiler
  2. disassembler, either standalone or integrated into LIMA_DEBUG_SHADER
  3. result validator

prasannatsm commented 6 years ago

Why not send the current code as an RFC to the kernel and get feedback? This would parallelise things and, I believe, save a lot of time.

yuq commented 6 years ago

I'll consider this some time later, because the current work is not ready: some important blocks are still waiting to be implemented, and they will change the final code shape a lot.

superna9999 commented 6 years ago

@yuq will you wait until 4.16-rc1 to rebase on 4.16 and use drm_sched?

yuq commented 6 years ago

I want to wait until the 4.16 release for the next rebase. Currently I just backport drm_sched to the 4.13 kernel for the gpu scheduler dev.

anarsoul commented 6 years ago

@yuq do you mean a release candidate or the release? The 4.16 release probably won't happen in the next ~3 months.

yuq commented 6 years ago

I mean the final release, not an RC. That's about the time I expect to finish the planned kernel work.

anarsoul commented 6 years ago

@yuq I'd suggest starting with 4.16-rc1; that way we'll get more testing with the 4.16 codebase. But of course, it's up to you. I'll probably be working with 4.16-rc, since I have other patches that I'm planning to submit upstream.

mmind commented 6 years ago

Especially as the big changes are mostly made by -rc1, so there shouldn't be any surprises, only fixes waiting in the later -rcs :-)

yuq commented 6 years ago

Progress update:

switch to drm_sched is done
I found that the dynamic mem alloc refine for lima_draw.c is independent of the kernel memory management refine (u_uploader is the right way), so I plan to work on this first
Rob Clark suggests we don't need to support contiguous memory allocation in lima: we can follow etnaviv and always allocate a dumb buffer in the display DRM and export it to lima. Then the display DRM is responsible for making the buffer contiguous or not to meet its own needs, and lima just needs to implement non-contiguous mem alloc. @enunes are you interested in refining your renderonly lib implementation this way after finishing the index draw?
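For the dynamic mem alloc item above, here is a rough sketch of the upload-buffer pattern: small allocations are carved out of a larger buffer, and a fresh buffer is started when the current one runs out. This is a simplified stand-in written for illustration, not Mesa's upload manager API; `uploader`, `uploader_alloc` and the other names are hypothetical, and plain `malloc` memory stands in for GPU buffers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One backing buffer; a real driver would use a GPU buffer object. */
struct chunk {
    struct chunk *next;
    size_t size;          /* usable bytes in data[] */
    size_t used;          /* bytes already handed out */
    unsigned char data[]; /* flexible array member */
};

struct uploader {
    struct chunk *head;   /* current chunk; older chunks follow */
    size_t default_size;  /* size of a fresh chunk */
    unsigned num_chunks;  /* how many chunks were allocated */
};

static void uploader_init(struct uploader *u, size_t default_size)
{
    u->head = NULL;
    u->default_size = default_size;
    u->num_chunks = 0;
}

/* Reserve `size` bytes at an offset aligned to `align` (a power of two).
 * The alignment applies to the offset inside the chunk, which is how a
 * GPU would address it. Returns NULL if memory runs out. */
static void *uploader_alloc(struct uploader *u, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    struct chunk *c = u->head;
    size_t start = c ? (c->used + align - 1) & ~(align - 1) : 0;

    if (!c || start > c->size || size > c->size - start) {
        /* Current chunk is full: start a new one, big enough for `size`. */
        size_t chunk_size = size > u->default_size ? size : u->default_size;
        c = malloc(sizeof(*c) + chunk_size);
        if (!c)
            return NULL;
        c->next = u->head;
        c->size = chunk_size;
        c->used = 0;
        u->head = c;
        u->num_chunks++;
        start = 0;
    }
    c->used = start + size;
    return c->data + start;
}

/* Free all chunks; earlier allocations stay valid until this is called. */
static void uploader_destroy(struct uploader *u)
{
    while (u->head) {
        struct chunk *next = u->head->next;
        free(u->head);
        u->head = next;
    }
}
```

The point of the pattern is that the command stream no longer needs a fixed size chosen up front: it grows chunk by chunk, entirely in user space.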

enunes commented 6 years ago

> @enunes are you interested in refine your renderonly lib implementation to this way after finishing the index draw?

Just sent a pull request with the index draw changes. I can look into the renderonly lib changes, but I need to look at the etnaviv code first to better understand what needs to be done. Can you clarify why these changes are necessary with our current renderonly implementation?

yuq commented 6 years ago

OK, here are my thoughts based on Rob's advice:

The current renderonly lib implementation allocates all buffers from lima and exports them to the DRM display driver. So the lima kernel driver has to handle both contiguous and non-contiguous memory allocation. These two kinds of memory come from different methods: contiguous memory comes from dma_alloc_xxx, which mostly draws from the CMA pool, while non-contiguous memory comes from alloc_page.

But if we only allocate scanout buffers from the display DRM driver (and export them to lima) and allocate non-scanout buffers from lima, then the lima kernel driver just needs to implement non-contiguous memory alloc. That simplifies the lima kernel MM, and maybe it can reuse an existing MM method like TTM.
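The difference between the two approaches can be sketched as a small decision function. This is only an illustration of the policy described above, not driver code; the enum names and `pick_allocator` are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Which approach the renderonly lib follows (hypothetical names). */
enum policy {
    POLICY_ALL_FROM_LIMA,        /* current: lima allocates every buffer */
    POLICY_SCANOUT_FROM_DISPLAY  /* etnaviv way: display DRM allocates scanout */
};

/* Where a buffer ends up being allocated. */
enum allocator {
    ALLOC_LIMA_CONTIGUOUS, /* lima, contiguous (dma_alloc_xxx / CMA) */
    ALLOC_LIMA_PAGES,      /* lima, non-contiguous (alloc_page) */
    ALLOC_DISPLAY_DUMB     /* display DRM dumb buffer, exported to lima */
};

static enum allocator pick_allocator(enum policy p, bool scanout,
                                     bool display_needs_contiguous)
{
    assert(p == POLICY_ALL_FROM_LIMA || p == POLICY_SCANOUT_FROM_DISPLAY);

    /* Non-scanout buffers are only used by the GPU: plain pages suffice. */
    if (!scanout)
        return ALLOC_LIMA_PAGES;

    /* Etnaviv way: the display driver allocates scanout buffers and
     * decides on its own whether they must be contiguous. */
    if (p == POLICY_SCANOUT_FROM_DISPLAY)
        return ALLOC_DISPLAY_DUMB;

    /* Current way: lima must satisfy the display's requirement itself. */
    return display_needs_contiguous ? ALLOC_LIMA_CONTIGUOUS
                                    : ALLOC_LIMA_PAGES;
}
```

Under `POLICY_SCANOUT_FROM_DISPLAY` the function never returns `ALLOC_LIMA_CONTIGUOUS`, which is why the contiguous path could be dropped from the lima kernel driver.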

yuq commented 6 years ago

Progress update:

switch to drm_sched (done)
lima_draw refine (done)
  1. dynamic MM using u_upload
  2. flush command stream before read from a buffer @PabloPL
  3. FBO support (tested with https://github.com/yuq/gfx/tree/master/gbm-surface-fbo)
kernel non-contiguous mem MM (next work)

With the FBO test app, it seems texture support has a problem, @anarsoul: the Blue/Red channels are swapped. The test app's render buffer is PIPE_FORMAT_B8G8R8A8_UNORM while the texture is PIPE_FORMAT_R8G8B8A8_UNORM, and in the final result the B/R channels are swapped.
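To illustrate how this kind of swap can arise, here is a small sketch that writes a pixel in one byte order and reads it back in the other. It assumes the format names list components starting from the lowest byte in memory (B first for B8G8R8A8, R first for R8G8B8A8). The helper names are hypothetical, and this shows only the symptom, not where the actual bug in the driver is:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rgba { uint8_t r, g, b, a; };

/* Store a color with bytes laid out as B, G, R, A (assumed B8G8R8A8). */
static void encode_b8g8r8a8(struct rgba c, uint8_t px[4])
{
    assert(px != NULL);
    px[0] = c.b;
    px[1] = c.g;
    px[2] = c.r;
    px[3] = c.a;
}

/* Read the bytes back with the matching layout. */
static struct rgba decode_b8g8r8a8(const uint8_t px[4])
{
    return (struct rgba){ px[2], px[1], px[0], px[3] };
}

/* Read the same bytes as if they were laid out R, G, B, A. */
static struct rgba decode_r8g8b8a8(const uint8_t px[4])
{
    return (struct rgba){ px[0], px[1], px[2], px[3] };
}
```

Encoding opaque red with `encode_b8g8r8a8` and reading it back with `decode_r8g8b8a8` yields opaque blue, while green and alpha are unaffected.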

anarsoul commented 6 years ago

@yuq I'll take a look. Can I reproduce it with gbm-surface-fbo?

yuq commented 6 years ago

Yes, you can.

anarsoul commented 6 years ago

I opened https://github.com/yuq/mesa-lima/issues/32 to track this issue

yuq commented 6 years ago

Thanks, the right result should be red, but on my Allwinner H3 it's blue.

yuq commented 6 years ago

Progress update:

switch to drm_sched (done)
lima_draw refine (done)
kernel non-contiguous mem MM (done)

Now the kernel supports both contiguous and non-contiguous mem alloc in the lima driver. I plan to discard contiguous mem support in lima and use the etnaviv way, but it's not too complicated to have both for now. We can remove the contiguous mem support later, when @enunes finishes the renderonly lib change.

For now I just make every display DRM winsys request a contiguous scanout buffer, but I know some display DRM drivers support non-contiguous scanout buffers. So people with these SoCs may try changing the 'contiguous_scanout' argument of lima_drm_screen_create_renderonly(). If that works, I'll be happy to see a merge request for it.

yuq / mesa-lima

plan about my lima dev #29