klange / toaruos

A completely-from-scratch hobby operating system: bootloader, kernel, drivers, C library, and userspace including a composited graphical UI, dynamic linker, syntax-highlighting text editor, network stack, etc.
https://toaruos.org/
University of Illinois/NCSA Open Source License

Kernel panic in `mmu_unmap_user` in threaded app #265

Open Vir-BryanQ opened 2 years ago

Vir-BryanQ commented 2 years ago

I am trying to implement a very simple video player based on ffmpeg, so I did the following:

  1. In order to port ffmpeg to toaruos, I first built SDL1.2 from https://github.com/klange/SDL myself and installed it into toaruos so that its headers and shared libraries are available. All of this was done by cross-compiling inside a Docker container.
  2. After installing SDL1.2, I ported ffmpeg, again inside a Docker container. The process was almost the same as building SDL, and the port succeeded once these issues were fixed:

    • ceilf() is missing from libm (I stubbed it out, because it seemed to be used only in libavfilter)
    • the pthread_cond_xxx functions are not implemented (I worked around this by disabling pthreads when building ffmpeg)
    • some macros are missing from inttypes.h (very easy to fix)

    The ffmpeg version is 2.4.1, and the port can be found at http://q3z8400525.oicp.vip:25587/ffmpeg.tar.gz

  3. With ffmpeg ported, I ran the code below:
      
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/time.h>
    #include <sys/wait.h>
    #include <pthread.h>
    #include <sched.h>
    #include <signal.h>

    #include <libavformat/avformat.h>
    #include <libavcodec/avcodec.h>
    #include <libavutil/imgutils.h>
    #include <libavutil/mathematics.h>
    #include <libavutil/samplefmt.h>
    #include <libswscale/swscale.h>

    #include <toaru/menu.h>
    #include <toaru/yutani.h>
    #include <toaru/graphics.h>
    #include <toaru/spinlock.h>
    #include <toaru/list.h>
    #include <toaru/decorations.h>

    #define BUFFER_LEN 128

    int should_exit = 0;

    pthread_t player, decoder;

    yutani_t * yctx;
    yutani_window_t * ywin;
    gfx_context_t * ctx;

    int decor_width, decor_height, decor_top_height, decor_bottom_height, decor_left_width, decor_right_width;
    int width, height;
    int win_width, win_height;

    AVFormatContext * format_ctx;
    AVCodecContext * codec_ctx;
    AVCodec * codec;
    int video_stream_index;

    volatile void * volatile buffer[BUFFER_LEN] = {0};
    volatile size_t read_ptr = 0;
    volatile size_t write_ptr = 0;

    void sigint_handler(void) {
        should_exit = 1;

        pthread_join(player, NULL);
        pthread_join(decoder, NULL);

        exit(1);
    }

    void * buffer_read(void) {
        do {
            if (read_ptr != write_ptr) {
                volatile void * out = buffer[read_ptr];
                buffer[read_ptr] = 0;
                read_ptr = (read_ptr + 1) % BUFFER_LEN;
                return (void *)out;
            }
            sched_yield();
        } while (1);
    }

    void buffer_write(void * target) {
        do {
            if ((write_ptr >= read_ptr || write_ptr < read_ptr - 1) && !((write_ptr == BUFFER_LEN-1) && (read_ptr == 0))) {
                buffer[write_ptr] = target;
                write_ptr = (write_ptr + 1) % BUFFER_LEN;
                return;
            }
            sched_yield();
        } while (1);
    }

    typedef struct {
        size_t number;
        size_t pts;
        int width;
        int height;
        char data[];
    } my_frame;

    double tmp;

    void * player_thread(void * garbage) {

    struct timeval tv;
    int64_t start_time = 0;
    gettimeofday(&tv, NULL);
    start_time = tv.tv_sec * 1000000 + tv.tv_usec;

    while (!should_exit) 
    {
          my_frame * frame = buffer_read();
    
          if (!frame) 
          {
                fprintf(stderr, "Something is wrong, frame was zero. Bail.\n");
                break;
          }
    
          if (frame->width == 0 && frame->height == 0) break;
    
          printf("\rFrame [%lld]", frame->number);
          printf(" pts: %lld    ", frame->pts);
          fflush(stdout);
    
          int64_t new_time = 0;
          gettimeofday(&tv, NULL);
          new_time = tv.tv_sec * 1000000 + tv.tv_usec;
    
          while (new_time - start_time < frame->pts * tmp) 
          {
                if (frame->pts * tmp - (new_time - start_time) > 2000) 
                {
                      sched_yield();
                }
                gettimeofday(&tv, NULL);
                new_time = tv.tv_sec * 1000000 + tv.tv_usec;
          }
    
          int i = 0;
          for (int y = decor_top_height; y < decor_top_height + height; ++y)
          {
                for (int x = decor_left_width; x < decor_left_width + width; ++x)
                {
                      GFX(ctx, x, y) = *((uint32_t *)frame + i);
                      ++i;
                }
          }
          render_decorations(ywin, ctx, "VidPlayer");
    
          free(frame);
    
          yutani_flip(yctx, ywin);
    }
    
    return NULL;

    }

    my_frame death_packet = { 0, 0, 0 };

    void * decoder_thread(void * arg) {

    AVPacket packet;
    AVFrame * frame;
    int framedone;
    
    struct SwsContext * swctx;
    
    frame = av_frame_alloc();
    
    if (!frame) 
    {
          fprintf(stderr, "frak, out of memz\n");
    }
    
    fprintf(stderr, "Width = %d, Height = %d, converting from format #%d...\n", frame->width, frame->height, frame->format);
    
    swctx = sws_getContext(width, height, codec_ctx->pix_fmt, width, height, AV_PIX_FMT_RGB32, 0, 0, 0, 0);
    
    uint8_t *dst_data[4];
    int dst_linesize[4];
    av_image_alloc(dst_data, dst_linesize, width, height, AV_PIX_FMT_RGB32, 1);
    
    tmp = (double)format_ctx->streams[video_stream_index]->time_base.num / (double)format_ctx->streams[video_stream_index]->time_base.den * 1000000;
    
    int i = 0;
    
    while (!should_exit && av_read_frame(format_ctx, &packet) >= 0) 
    {
          if (packet.stream_index == video_stream_index) 
          {
                avcodec_decode_video2(codec_ctx, frame, &framedone, &packet);
    
                if (framedone) 
                {
                      i++;
    
                      sws_scale(swctx, (const uint8_t * const *)frame->data, frame->linesize, 0, frame->height, dst_data, dst_linesize);
    
                      my_frame * f = malloc(sizeof(my_frame) + width * height * 4);
                      f->number = frame->coded_picture_number;
                      f->pts = frame->pkt_pts;
                      f->width = width;
                      f->height = height;
                      memcpy(&f->data, dst_data[0], width * height * 4);
                      buffer_write(f);
    
                }
          }
          av_free_packet(&packet);
    }
    
    buffer_write(&death_packet);
    
    av_free(frame);
    
    return NULL;

    }

    int main(int argc, char * argv[]) {

    av_register_all();
    
    format_ctx = avformat_alloc_context();
    
    if (avformat_open_input(&format_ctx, argv[1], 0, NULL)) 
    {
          return 1;
    }
    
    if (avformat_find_stream_info(format_ctx, NULL) < 0) 
    {
          return 2;
    }
    
    av_dump_format(format_ctx, 0, argv[1], 0);
    
    video_stream_index = -1;
    for (int i = 0; i < format_ctx->nb_streams ; ++i) 
    {
          if (format_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) 
          {
                video_stream_index = i;
                break;
          }
    }
    
    if (video_stream_index < 0) 
    {
          return 3;
    }
    
    codec_ctx = format_ctx->streams[video_stream_index]->codec;
    codec = avcodec_find_decoder(codec_ctx->codec_id);
    
    int r = avcodec_open2(codec_ctx, codec, NULL);
    
    if (!codec) 
    {
          return 4;
    }
    
    width = codec_ctx->width;
    height = codec_ctx->height;
    
    yctx = yutani_init();
    
    init_decorations();
    
    struct decor_bounds bounds;
    decor_get_bounds(ywin, &bounds);
    decor_top_height = bounds.top_height;
    decor_bottom_height = bounds.bottom_height;
    decor_left_width = bounds.left_width;
    decor_right_width = bounds.right_width;
    decor_width = bounds.width;
    decor_height = bounds.height;
    
    win_height = bounds.height + height;
    win_width = bounds.width + width;
    
    ywin = yutani_window_create(yctx, win_width, win_height);
    yutani_window_advertise_icon(yctx, ywin, "VidPlayer", "plasma");  // just use the icon of plasma 
    
    ctx = init_graphics_yutani(ywin);
    
    draw_fill(ctx, rgb(127, 0, 127));
    render_decorations(ywin, ctx, "VidPlayer");
    yutani_flip(yctx, ywin);
    
    pthread_create(&decoder, NULL, decoder_thread, NULL);
    pthread_create(&player, NULL, player_thread, NULL);
    
    signal(SIGINT, (void (*)(int))sigint_handler);
    while (!should_exit)
    {
          yutani_msg_t * m = yutani_poll(yctx);
          while (m)
          {
                menu_process_event(yctx, m);
                switch (m->type)
                {
                      case YUTANI_MSG_KEY_EVENT:
                      {
                          struct yutani_msg_key_event * ke = (void*)m->data;
                          if (ke->event.action == KEY_ACTION_DOWN && ke->event.keycode == 'q') 
                          {
                              should_exit = 1;
                          }
                          break;
                      }
                      case YUTANI_MSG_WINDOW_MOUSE_EVENT:
                      {
                          struct yutani_msg_window_mouse_event * me = (void*)m->data;
                          switch (decor_handle_event(yctx, m)) 
                          {
                              case DECOR_CLOSE:
                                  should_exit = 1;
                                  break;
                              case DECOR_RIGHT:
                                  decor_show_default_menu(ywin, ywin->x + me->new_x, ywin->y + me->new_y);
                                  break;
                          }
                          break;
                      }
    
                      case YUTANI_MSG_WINDOW_CLOSE:
                      case YUTANI_MSG_SESSION_END:
                      {
                          should_exit = 1;
                          break;
                      }
                }
    
                free(m);
                m = yutani_poll_async(yctx);
          }
    }
    
    avcodec_close(codec_ctx);
    avformat_close_input(&format_ctx);
    
    pthread_join(player, NULL);
    pthread_join(decoder, NULL);
    
    return 0;

    }

The code is easy to follow. While it runs there are three threads: one decodes the video, one renders frames to the screen, and the main thread receives messages. The first run works well, and if I interrupt it with Ctrl+C or some other method, it stops normally. But when I run it again, the kernel panics. Sometimes the panic only happens on the third or fourth run of the player.
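The decoder and player threads hand frames to each other through a ring buffer whose indices are only marked `volatile`, which in C gives no ordering guarantees between threads. As a side note unrelated to the kernel panic, here is a minimal sketch of the same one-producer, one-consumer queue using C11 atomics. It assumes the toolchain provides `<stdatomic.h>`; `ring_t`, `ring_try_push` and `ring_try_pop` are hypothetical names, not part of the player or of toaruos.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_LEN 128  /* same capacity as BUFFER_LEN in the player above */

/* Hypothetical single-producer/single-consumer queue of frame pointers. */
typedef struct {
    void *slots[RING_LEN];
    atomic_size_t head;  /* next slot to read; written only by the consumer */
    atomic_size_t tail;  /* next slot to write; written only by the producer */
} ring_t;

/* Returns 1 on success, 0 if the ring is full (one slot always stays empty). */
static int ring_try_push(ring_t *r, void *item) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t next = (tail + 1) % RING_LEN;
    if (next == atomic_load_explicit(&r->head, memory_order_acquire))
        return 0;
    r->slots[tail] = item;
    /* Release: the slot write above becomes visible before the new tail. */
    atomic_store_explicit(&r->tail, next, memory_order_release);
    return 1;
}

/* Returns the oldest item, or NULL if the ring is empty. */
static void *ring_try_pop(ring_t *r) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
        return NULL;
    void *item = r->slots[head];
    atomic_store_explicit(&r->head, (head + 1) % RING_LEN, memory_order_release);
    return item;
}
```

A caller would loop on `ring_try_push` or `ring_try_pop` with `sched_yield()` between attempts, as `buffer_write` and `buffer_read` already do.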

Screenshot: panic

Video to play: http://q3z8400525.oicp.vip:25587/i.mp4

klange commented 2 years ago

Ooh, cross-thread unmap refcount mismatch… fun. Sounds like two threads probably tried to unmap the same page simultaneously and some locking or other synchronization is missing, or something neglected to reference the right page directory…

klange commented 2 years ago

@Vir-BryanQ Can you provide your build of ffmpeg and the test app? I would like to perform some interactive debugging with them.

Vir-BryanQ commented 2 years ago

Sure.

  1. You can find the ffmpeg port at http://q3z8400525.oicp.vip:25587/ffmpeg.tar.gz. I have opened this link again. Just use these commands to install ffmpeg in toaruos:

  2. As for the test app, I have since changed its source code, so it may be difficult to find a built copy. But the same source code is in my first comment on this issue, from 3 days ago. You can copy it and build it in toaruos with this command:

    • gcc -o test test.c -ltoaru_graphics -ltoaru_yutani -ltoaru_list -ltoaru_decorations -ltoaru_menu -lavformat -lavcodec -lswscale -lavutil
  3. In fact, the kernel panic does not always happen, so you may need to run the app several times to trigger it. Although the choice of video doesn't matter, here is the one I used: http://q3z8400525.oicp.vip:25587/i.mp4

klange commented 2 years ago

Thanks, I was able to install ffmpeg and build the video player, and was able to reproduce the kernel panic. This will be very helpful in tracking down the root cause.

Screenshot from 2022-11-02 12-37-05

Also, bugs aside, very nice to see ffmpeg running under my libc. I will spend some time, possibly after tracking down this page refcount issue, to improve the video player and get this all packaged.

klange commented 2 years ago

By the way, I noticed a bug in the video player source you supplied; fixing it does not affect the kernel issue, but I thought I'd point it out:

                         GFX(ctx, x, y) = *((uint32_t *)frame + i);

This should reference frame->data. Currently it's interpreting the frame header data as a few pixels, which causes everything to be shifted a bit, which is why your screenshot shows the scrollbar on the left. Also, this could be improved with a memcpy for each line. I see my original source, without support for decorations, was copying the entire buffer to the window (https://github.com/klange/toaru-vidplayer/blob/master/vidplayer.c#L103).
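To illustrate the per-line memcpy idea, here is a minimal sketch that copies a tightly packed 32-bit image into a larger buffer at an offset, one row at a time. It takes plain pointers because how the window's backbuffer is reached through `gfx_context_t` is not shown in this thread; `blit_rows` is a hypothetical helper, and the sketch assumes the destination is a contiguous array of 32-bit pixels whose stride is given in pixels.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy a width x height image of 32-bit pixels into dst, placing its
 * top-left corner at (off_x, off_y). dst_stride is the width of the
 * destination in pixels. One memcpy per row replaces the per-pixel loop.
 */
static void blit_rows(uint32_t *dst, int dst_stride, int off_x, int off_y,
                      const uint32_t *src, int width, int height) {
    for (int y = 0; y < height; ++y) {
        memcpy(dst + (size_t)(off_y + y) * dst_stride + off_x,
               src + (size_t)y * width,
               (size_t)width * sizeof(uint32_t));
    }
}
```

In the player, `src` would be `(const uint32_t *)frame->data` rather than `frame` itself, and the offsets would be `decor_left_width` and `decor_top_height`.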

klange commented 2 years ago

After some digging into tracebacks and memory state at the time of the panic, I believe this stems from some missing resource locking between threads when modifying page tables - both in the free() case calling mmu_unmap_user and in thread teardown later on. As mentioned in https://github.com/klange/toaruos/issues/263#issuecomment-1295772460, this is an area that needs a lot of improvement.
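For readers unfamiliar with this class of bug, the following is a userspace toy model, not toaruos kernel code, of why shared reference counts need a lock: if two threads decrement the same count without synchronization, both can believe they dropped the last reference, or the count can underflow. The names `page_refs`, `page_ref` and `page_unref` are hypothetical, and a kernel would use its own spinlock rather than a pthread mutex.

```c
#include <pthread.h>
#include <stddef.h>

#define NPAGES 16

/* Toy stand-in for per-page reference counts. */
static unsigned page_refs[NPAGES];
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take one reference on a page. */
static void page_ref(size_t page) {
    pthread_mutex_lock(&page_lock);
    page_refs[page]++;
    pthread_mutex_unlock(&page_lock);
}

/*
 * Drop one reference on a page.
 * Returns 1 if this call released the last reference (the caller may free
 * the page), 0 if other references remain, and -1 if the count was already
 * zero, which is the kind of mismatch that would otherwise go unnoticed.
 */
static int page_unref(size_t page) {
    int result;
    pthread_mutex_lock(&page_lock);
    if (page_refs[page] == 0) {
        result = -1;
    } else {
        result = (--page_refs[page] == 0) ? 1 : 0;
    }
    pthread_mutex_unlock(&page_lock);
    return result;
}
```

Because the check and the decrement happen under one lock, exactly one caller ever sees the return value 1 for a given page.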

Vir-BryanQ commented 2 years ago

  1. Thanks for pointing out this bug in the video player. I noticed and fixed it two days ago. As you can see, this video player is weak and doesn't even support audio, so I rewrote it two days ago. The new player has five threads: demuxer, audio decoder, audio player, video decoder, and video player. It supports almost all common audio and video formats, since it is based on ffmpeg. But it is still too weak to be a real player, as I haven't found a good way to synchronize video and audio. New player: http://q3z8400525.oicp.vip:25587/test.c
  2. About the ffmpeg port: ffmpeg has worked well since it was ported to toaruos, but it's not perfect. As mentioned, to build it I stubbed the libm function ceilf(), and I don't know what potential problems that will cause. Besides, this port doesn't support pthreads, because the pthread_cond_xxx functions are missing.
  3. About timer support in SDL1.2: this port includes a powerful player called ffplay in ffmpeg/bin, and it doesn't work under toaruos. ffplay is based on SDL, and I get the error "SDL is not built with timer support" when I try to run it. So I think the port of SDL1.2 https://github.com/klange/SDL may be incomplete? (That's why I don't use SDL to implement the video player.)
  4. After I rewrote the video player, the new player no longer seems to cause the kernel panic. However, there is a new problem. Sometimes toaruos gets stuck while the player is running (deadlock in the kernel?). The mouse and keyboard stop working entirely, so I can do nothing but reboot toaruos.

klange commented 2 years ago

I pushed a small patch a week ago that should improve the stability of threaded applications unmapping pages, at least during runtime. There is a lot that needs to be fixed around thread cleanup when processes exit, still.

> So I think the port of SDL1.2 https://github.com/klange/SDL may be incomplete?

You're right - we are missing the timer interfaces. I think a variety of the 'standard' Unix timer implementation should be usable and will take a look at getting it into the SDL builds.

> Sometimes toaruos gets stuck while the player is running (deadlock in the kernel?).

Hm, if there's no crash log from the kernel, then most likely the compositor has failed in a strange way and the system is still running but with no GUI. If that is the case, a serial console (sudo getty /dev/ttyS0) would still work if you had one running, otherwise it's a bit difficult to recover through a debugger.