TabuaTambalam / DalleWebms


Fixing output errors when use_vulkan_compute=True #1

Open TabuaTambalam opened 2 years ago

TabuaTambalam commented 2 years ago
  1. net.opt.use_vulkan_compute=True/False has no effect at all: Solved, use pip install ncnn-vulkan.

  2. Python core crash after extract, with the message Unknown error occurred when clearing extractor: Solved, use ex = net.create_extractor() instead of with net.create_extractor() as ex: (will there be a VRAM leak?) Edit: VRAM leak confirmed; it relates to reclaim_blob_allocator(). Contacting the maker of ncnn-vulkan about it. Edit 2: Made an ExtractorGPU for the Vulkan route; VRAM leak solved.

  3. 256x256 output is blank: Solved by adding:

    net.opt.use_fp16_packed = False
    net.opt.use_fp16_storage = False
    net.opt.use_fp16_arithmetic = False
  4. 256x(16n) output is blank: Unsolved. The output at node 35 (BinaryOp-Add, node16 + node34) differs significantly between CPU and Vulkan when the input is a 512 (16x32) sequence; many of its values become negative on Vulkan.
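A plausible reason the fp16 flags in item 3 matter (my reading, not confirmed upstream): fp16 has only a 10-bit mantissa and a maximum value of about 65504, so intermediate values in an add can be rounded away or overflow to inf, which can render as a blank image. A minimal numpy sketch of both failure modes:

```python
import numpy as np

# fp16 has a 10-bit mantissa: above 2048 the spacing between
# representable values is 2, so adding 1 is rounded away entirely.
a = np.float16(2048) + np.float16(1)
print(a)  # 2048.0 -- the +1 is lost

# fp16 overflows past ~65504: a product that fits easily in fp32
# becomes inf, which then propagates through later layers.
b = np.float16(300) * np.float16(300)
print(b)  # inf

# The same operations in fp32 are exact here.
print(np.float32(2048) + np.float32(1))  # 2049.0
```

This is only an illustration of fp16 precision limits in general; whether it is the actual cause of the node-35 diff would need checking against the real activations.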

TabuaTambalam commented 2 years ago

UseVulkan now works without the VRAM leak, but it still can't do 256x(16n) decodes. Use Tools -> Reload ncnndec and tick UseVulkan if you really want GPU acceleration for ncnn VQGAN decoding. (It may slow down InfiniteSimilarGen if that is running in the background.)

====old message====

I avoided that Unknown error occurred when clearing extractor! crash by patching src/net.cpp like this:

void Extractor::clear()
{
    d->blob_mats.clear();

#if NCNN_VULKAN
    if (d->opt.use_vulkan_compute)
    {
        d->blob_mats_gpu.clear();
        d->blob_mats_gpu_image.clear();
/*
        if (d->local_blob_vkallocator)
        {
            d->net->vulkan_device()->reclaim_blob_allocator(d->local_blob_vkallocator);
        }
        if (d->local_staging_vkallocator)
        {
            d->net->vulkan_device()->reclaim_staging_allocator(d->local_staging_vkallocator);
        }
*/
    }
#endif // NCNN_VULKAN
}

Now it's compatible with the old with net.create_extractor() as ex: route, but yes, VRAM leaks just like before: it takes up to 8 GB of VRAM after 9~10 256x256 VQGAN decodes.

What do you think is the proper way to report this to ncnn upstream? (Are you on the team?)

The test script: https://github.com/TabuaTambalam/DalleWebms/blob/main/docs/debugging/ncnndec.py

Prerequisite setup script:

!apt-get install -y libvulkan-dev libomp5
!wget https://github.com/TabuaTambalam/vqqncnn/releases/download/0.2/ncnn-1.0.20220729-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
!pip install /content/ncnn-1.0.20220729-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
!wget -O /tmp/vq.param https://raw.githubusercontent.com/TabuaTambalam/vqqncnn/main/vq.param
!wget -O /tmp/vq.bin https://github.com/TabuaTambalam/vqqncnn/releases/download/0.0/vq.bin
!wget https://github.com/TabuaTambalam/DalleWebms/releases/download/0.1/ozv.bin
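Putting the thread's fixes together, here is a hedged sketch of the setup the script above implies. configure_net and load_vq_net are hypothetical helper names (not from the repo); the option names (use_vulkan_compute, use_fp16_packed, use_fp16_storage, use_fp16_arithmetic) and the load_param/load_model calls are the real ncnn Python binding API discussed in this thread:

```python
# Sketch only: assumes the ncnn-vulkan wheel from the setup above is installed.
FP16_OPTS = ("use_fp16_packed", "use_fp16_storage", "use_fp16_arithmetic")

def configure_net(net, use_vulkan=False):
    """Apply the workarounds from this thread to an ncnn.Net-like object."""
    net.opt.use_vulkan_compute = use_vulkan   # fix 1: needs the ncnn-vulkan wheel
    for name in FP16_OPTS:                    # fix 3: blank 256x256 output
        setattr(net.opt, name, False)
    return net

def load_vq_net(ncnn, param_path="/tmp/vq.param", bin_path="/tmp/vq.bin",
                use_vulkan=False):
    net = configure_net(ncnn.Net(), use_vulkan)
    net.load_param(param_path)
    net.load_model(bin_path)
    return net

# Fix 2: create the extractor by plain assignment, NOT with
# `with net.create_extractor() as ex:`, to avoid the clear() crash:
# ex = net.create_extractor()
```

The options must be set before load_param/load_model so the layers are created for the right pipeline.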
joeyballentine commented 2 years ago

I am not on the ncnn team. I think you should just make an issue on the ncnn GitHub repository. Cheers.