Open Vir-BryanQ opened 2 years ago
Ooh, cross-thread unmap refcount mismatch… fun. Sounds like two threads probably tried to unmap the same page simultaneously and some locking or other synchronization is missing, or something neglected to reference the right page directory…
@Vir-BryanQ Can you provide your build of ffmpeg and the test app? I would like to perform some interactive debugging with them.
Sure.
You can find the port of ffmpeg at http://q3z8400525.oicp.vip:25587/ffmpeg.tar.gz. I have opened this link again. Just use these commands to install ffmpeg in toaruos:
As for the test app, I have since changed its source code, so it may be difficult to find a built copy. However, I posted the same source code in my first comment on this issue 3 days ago. You can copy it and build it in toaruos. Just use these commands:
In fact, the kernel panic does not always happen, and you may need to run the player several times to trigger it. Although the video being played doesn't matter, here is the one I used: http://q3z8400525.oicp.vip:25587/i.mp4
Thanks, I was able to install ffmpeg and build the video player, and was able to reproduce the kernel panic. This will be very helpful in tracking down the root cause.
Also, bugs aside, very nice to see ffmpeg running under my libc. I will spend some time, possibly after tracking down this page refcount issue, to improve the video player and get this all packaged.
By the way, I noticed a bug in the video player source you supplied; fixing it does not affect the kernel issue, but I thought I'd point it out:
GFX(ctx, x, y) = *((uint32_t *)frame + i);
This should reference frame->data. Currently it's interpreting the frame header data as a few pixels, which causes everything to be shifted a bit, which is why your screenshot shows the scrollbar on the left. Also, this could be improved with a memcpy for each line. I see my original source, without support for decorations, was copying the entire buffer to the window (https://github.com/klange/toaru-vidplayer/blob/master/vidplayer.c#L103).
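To illustrate the suggested fix, here is a minimal sketch of a per-line memcpy that reads from frame->data. The my_frame layout matches the player source; blit_frame, the plain uint32_t destination buffer, and the stride and offset parameters are hypothetical stand-ins for the window's backbuffer and decoration offsets, and the sketch assumes tightly packed 32-bit pixels.

```c
#include <stdint.h>
#include <string.h>

/* Same layout as the my_frame struct in the player source. */
typedef struct {
    size_t number;
    size_t pts;
    int width;
    int height;
    char data[];
} my_frame;

/*
 * Hypothetical helper: copy a decoded 32-bit frame into a destination
 * pixel buffer at (off_x, off_y). dst_stride is the destination row
 * length in pixels. The caller must ensure the frame fits.
 */
static void blit_frame(uint32_t *dst, size_t dst_stride,
                       int off_x, int off_y, const my_frame *frame)
{
    size_t row_bytes = (size_t)frame->width * sizeof(uint32_t);
    for (int y = 0; y < frame->height; y++) {
        /* Read pixels from frame->data, not from the start of the struct. */
        const char *src = frame->data + (size_t)y * row_bytes;
        uint32_t *row = dst + (size_t)(off_y + y) * dst_stride + (size_t)off_x;
        memcpy(row, src, row_bytes);
    }
}
```

In the player, the offsets would presumably be the left and top decoration sizes, so the frame lands inside the window borders.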
After some digging into tracebacks and memory state at the time of the panic, I believe this stems from some missing resource locking between threads when modifying page tables - both in the free() case calling mmu_unmap_user and in thread teardown later on. As mentioned in https://github.com/klange/toaruos/issues/263#issuecomment-1295772460, this is an area that needs a lot of improvement.
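As a rough userspace analogy of the race described above (this is not the kernel's code), the sketch below has two threads release the same set of pages. All names here (page_refcount, page_lock, page_unref, run_two_unmappers) are hypothetical. With the check and the decrement performed under one lock, each page is freed exactly once and the second attempt is detected instead of corrupting the count.

```c
#include <pthread.h>
#include <stdint.h>

#define NUM_PAGES 64

/* Hypothetical stand-ins for per-page reference counts and their lock. */
static uint8_t page_refcount[NUM_PAGES];
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one reference. Returns 1 if this call freed the page,
 * 0 if references remain, -1 if the count was already zero. */
static int page_unref(int page)
{
    int result;
    pthread_mutex_lock(&page_lock);
    if (page_refcount[page] == 0)
        result = -1;                      /* mismatch: nothing to release */
    else
        result = (--page_refcount[page] == 0);
    pthread_mutex_unlock(&page_lock);
    return result;
}

struct tally { int freed; int mismatched; };

/* Thread body: try to release every page once. */
static void *unmap_all(void *arg)
{
    struct tally *t = arg;
    for (int p = 0; p < NUM_PAGES; p++) {
        int r = page_unref(p);
        if (r == 1) t->freed++;
        else if (r == -1) t->mismatched++;
    }
    return NULL;
}

/* Give every page one reference, then let two threads race to drop it. */
static struct tally run_two_unmappers(void)
{
    struct tally a = {0, 0}, b = {0, 0};
    pthread_t ta, tb;
    for (int p = 0; p < NUM_PAGES; p++)
        page_refcount[p] = 1;
    pthread_create(&ta, NULL, unmap_all, &a);
    pthread_create(&tb, NULL, unmap_all, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return (struct tally){ a.freed + b.freed, a.mismatched + b.mismatched };
}
```

Without the lock, both threads could read a count of 1 and both decrement, which is the kind of double release that a refcount mismatch check would then trip over.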
I pushed a small patch a week ago that should improve the stability of threaded applications unmapping pages, at least during runtime. There is a lot that needs to be fixed around thread cleanup when processes exit, still.
So I think the port of SDL1.2 https://github.com/klange/SDL may be incomplete?
You're right - we are missing the timer interfaces. I think a variant of the 'standard' Unix timer implementation should be usable, and I will take a look at getting it into the SDL builds.
Sometimes toaruos gets stuck while the player is running (a deadlock in the kernel?).
Hm, if there's no crash log from the kernel, then most likely the compositor has failed in a strange way and the system is still running but with no GUI. If that is the case, a serial console (sudo getty /dev/ttyS0) would still work if you had one running, otherwise it's a bit difficult to recover through a debugger.
I am trying to implement a very simple video player based on ffmpeg, so I did the following:
After installing SDL1.2, I started porting ffmpeg inside a Docker container. As with building SDL, ffmpeg was ported successfully after fixing a few issues. To build ffmpeg, we only need to fix these issues:
The ffmpeg version is 2.4.1, and this port can be found at http://q3z8400525.oicp.vip:25587/ffmpeg.tar.gz
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libavutil/imgutils.h>
#include <libavutil/mathematics.h>
#include <libavutil/samplefmt.h>
#include <libswscale/swscale.h>
#include <toaru/menu.h>
#include <toaru/yutani.h>
#include <toaru/graphics.h>
#include <toaru/spinlock.h>
#include <toaru/list.h>
#include <toaru/decorations.h>
#define BUFFER_LEN 128
int should_exit = 0;
pthread_t player, decoder;
yutani_t * yctx;
yutani_window_t * ywin;
gfx_context_t * ctx;
int decor_width, decor_height, decor_top_height, decor_bottom_height, decor_left_width, decor_right_width;
int width, height;
int win_width, win_height;
AVFormatContext * format_ctx;
AVCodecContext * codec_ctx;
AVCodec * codec;
int video_stream_index;
volatile void * volatile buffer[BUFFER_LEN] = {0};
volatile size_t read_ptr = 0;
volatile size_t write_ptr = 0;
void sigint_handler(void) { should_exit = 1;
}
void * buffer_read(void) {
    do {
        if (read_ptr != write_ptr) {
            volatile void * out = buffer[read_ptr];
            buffer[read_ptr] = 0;
            read_ptr = (read_ptr + 1) % BUFFER_LEN;
            return (void *)out;
        }
        sched_yield();
    } while (1);
}
void buffer_write(void * target) {
    do {
        if ((write_ptr >= read_ptr || write_ptr < read_ptr - 1) && !((write_ptr == BUFFER_LEN-1) && (read_ptr == 0))) {
            buffer[write_ptr] = target;
            write_ptr = (write_ptr + 1) % BUFFER_LEN;
            return;
        }
        sched_yield();
    } while (1);
}
typedef struct { size_t number; size_t pts; int width; int height; char data[]; } my_frame;
double tmp;
void * player_thread(void * garbage) {
    struct timeval tv;
    int64_t start_time = 0;
    gettimeofday(&tv, NULL);
    start_time = tv.tv_sec * 1000000 + tv.tv_usec;
}
my_frame death_packet = { 0, 0, 0 };
void * decoder_thread(void * arg) {
}
int main(int argc, char * argv[]) {
}
This code is easy to follow. When it runs, there are three threads: one decodes the video, one renders frames to the screen, and the main thread receives messages. The first run works well, and if I interrupt it with Ctrl+C or by other means, it stops normally. But when I try to run it again, the kernel panics. Sometimes the kernel only panics on the third or fourth run of the player.
Screenshot:
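The hand-off between the decoder thread and the player thread is a single-producer, single-consumer queue. As a sketch only, and assuming a C11 toolchain that provides <stdatomic.h> (not verified for toaruos), the same idea can be written with atomics instead of volatile so that each slot write is published before the index update. The names spsc_ring, ring_try_push, and ring_try_pop are hypothetical, and this variant returns immediately instead of spinning with sched_yield().

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_LEN 128

typedef struct {
    void *slots[RING_LEN];
    atomic_size_t head; /* next slot to read; written only by the consumer */
    atomic_size_t tail; /* next slot to write; written only by the producer */
} spsc_ring;

/* Producer side: returns 1 on success, 0 if the ring is full.
 * One slot is always left empty to tell "full" apart from "empty". */
static int ring_try_push(spsc_ring *r, void *item)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t next = (tail + 1) % RING_LEN;
    if (next == atomic_load_explicit(&r->head, memory_order_acquire))
        return 0;
    r->slots[tail] = item;
    /* Release: the slot contents become visible before the new tail. */
    atomic_store_explicit(&r->tail, next, memory_order_release);
    return 1;
}

/* Consumer side: returns the oldest item, or NULL if the ring is empty. */
static void *ring_try_pop(spsc_ring *r)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
        return NULL;
    void *item = r->slots[head];
    atomic_store_explicit(&r->head, (head + 1) % RING_LEN, memory_order_release);
    return item;
}
```

A caller that wants the blocking behaviour of the original buffer_read and buffer_write could loop on these functions and call sched_yield() between attempts. Because NULL signals an empty ring, items pushed must be non-NULL.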
Video to play: http://q3z8400525.oicp.vip:25587/i.mp4