mayingzhen / nvidia-texture-tools

Automatically exported from code.google.com/p/nvidia-texture-tools

CUDA technology does not work with the following code (texture stretch) #163

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Run the code below, which uses the library.

What is the expected output? What do you see instead?
My own stretch code runs 10 times faster than the equivalent code using your library.
I took a look at the library code. The stretch (resize) function seems to have been
written without using CUDA. If I am wrong and it does use CUDA, then the CUDA path is far too slow.
I need to make my function 4 times faster. Can you help me?

What version of the product are you using? On what operating system?
nvidia-texture-tools-2.0.8-1.tar.gz
Windows 7 32

Please provide any additional information below.
Code using library:

    nv::AutoPtr<nv::Filter> filter(new nv::BoxFilter());

        nv::Image image;
        image.setFormat(nv::Image::Format_ARGB);
        image.wrap((void*)local_source_texture_memory, cxImage, cyImage);

        nv::FloatImage fimage(&image);
        nv::AutoPtr<nv::FloatImage> fresult
            (
            fimage.resize
            (
            *filter,
            (local_target_rect_final_bufer.right-local_target_rect_final_bufer.left),
            (local_target_rect_final_bufer.bottom-local_target_rect_final_bufer.top),
            nv::FloatImage::WrapMode_Clamp
            )
            );

        image.unwrap();

        for(int local_counter_line=0;local_counter_line<(local_target_rect_final_bufer.bottom-local_target_rect_final_bufer.top);local_counter_line++)
        {
            float *local_line_data = fresult->scanline(local_counter_line,0);
            {
                for(int local_counter_row=0;local_counter_row<(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left);local_counter_row++)
                {
                    local_final_texture_memory_DATA_1
                        [
                            local_counter_row+local_counter_line*(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left)
                        ].rgbBlue
                        = 
                        255*local_line_data[local_counter_row];
                }
            }
            local_line_data = fresult->scanline(local_counter_line,1);
            {
                for(int local_counter_row=0;local_counter_row<(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left);local_counter_row++)
                {
                    local_final_texture_memory_DATA_1
                        [
                            local_counter_row+local_counter_line*(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left)
                        ].rgbGreen
                        = 
                        255*local_line_data[local_counter_row];
                }
            }
            local_line_data = fresult->scanline(local_counter_line,2);
            {
                for(int local_counter_row=0;local_counter_row<(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left);local_counter_row++)
                {
                    local_final_texture_memory_DATA_1
                        [
                            local_counter_row+local_counter_line*(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left)
                        ].rgbRed
                        = 
                        255*local_line_data[local_counter_row];
                }
            }
            local_line_data = fresult->scanline(local_counter_line,3);
            {
                for(int local_counter_row=0;local_counter_row<(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left);local_counter_row++)
                {
                    local_final_texture_memory_DATA_1
                        [
                            local_counter_row+local_counter_line*(local_target_rect_final_bufer.right-local_target_rect_final_bufer.left)
                        ].rgbReserved
                        = 
                        255*local_line_data[local_counter_row];
                }
            }
        }
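
The loops above store `255*local_line_data[...]` directly into byte fields, which relies on an implicit truncating float-to-byte conversion. As a minimal sketch, assuming the channel values are nominally in the range [0, 1], an explicit conversion with clamping and rounding could look like this. `float_to_byte` is a hypothetical helper introduced for illustration; it is not part of nvidia-texture-tools.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of nvidia-texture-tools): converts a
// normalized float channel value to an 8-bit value.
inline std::uint8_t float_to_byte(float v)
{
    // Clamp first so an out-of-range value cannot wrap around
    // when it is narrowed to 8 bits.
    const float clamped = std::min(1.0f, std::max(0.0f, v));
    // Adding 0.5 makes the truncating cast round to the nearest integer.
    return static_cast<std::uint8_t>(clamped * 255.0f + 0.5f);
}
```

Each assignment in the loops would then take the form `... .rgbBlue = float_to_byte(local_line_data[local_counter_row]);`. This changes rounding only; it is not expected to affect the speed difference reported here.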

Stretch function code:
void stretch_memory_R8G8B8A8(int source_height,int source_width,void *source_memory,int destination_height,int destination_width,void *destination_memory)
{
    BYTE *destination_memory_BYTE = (BYTE *)destination_memory;
    BYTE *source_memory_BYTE = (BYTE *)source_memory;

    const double local_initial_value_width_0 = double(destination_width)/double(source_width);

    for(int local_height_counter_source=0;local_height_counter_source<source_height;local_height_counter_source++)
    {
        int local_check_value_height = local_height_counter_source*destination_height/source_height+1+destination_height/source_height;
        if(local_check_value_height>destination_height)
        {
            local_check_value_height = destination_height;
        }
        for(int local_height_counter_destination=local_height_counter_source*destination_height/source_height;
            local_height_counter_destination<local_check_value_height;
            local_height_counter_destination++)
        {
            for(int local_width_counter_source=0;local_width_counter_source<source_width;local_width_counter_source++)
            {
                int local_initial_value_width = local_width_counter_source*local_initial_value_width_0;
                int local_check_value_width = local_initial_value_width+1+local_initial_value_width_0;
                if(local_check_value_width>destination_width)
                {
                    local_check_value_width = destination_width;
                }
                for(int local_width_counter_destination=local_initial_value_width;
                    local_width_counter_destination<local_check_value_width;
                    local_width_counter_destination++)
                {                           
                    *(DWORD*)(&destination_memory_BYTE[(local_width_counter_destination+local_height_counter_destination*destination_width)<<2]) =
                        *(DWORD*)(&source_memory_BYTE[(local_width_counter_source+local_height_counter_source*source_width)<<2]);
                }
            }
        }
    }
}
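
For comparison, the function above is a nearest-neighbour (pixel replication) stretch that iterates over source pixels and may write some destination pixels more than once. A common alternative formulation iterates over destination pixels instead, so each one is written exactly once and the column mapping is computed a single time. The following is a minimal sketch under stated assumptions: tightly packed 32-bit pixels, no row padding, and a hypothetical function name `stretch_nearest_32`. It is neither the library's implementation nor guaranteed to be pixel-identical to the function above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: destination-driven nearest-neighbour stretch for
// tightly packed 32-bit pixels (for example R8G8B8A8), no row padding.
void stretch_nearest_32(const std::uint32_t* src, int src_w, int src_h,
                        std::uint32_t* dst, int dst_w, int dst_h)
{
    if (src_w <= 0 || src_h <= 0 || dst_w <= 0 || dst_h <= 0) return;

    // Precompute the source column for every destination column once.
    // 64-bit intermediates avoid overflow for large images.
    std::vector<int> src_x(static_cast<std::size_t>(dst_w));
    for (int x = 0; x < dst_w; ++x)
        src_x[static_cast<std::size_t>(x)] =
            static_cast<int>(static_cast<std::int64_t>(x) * src_w / dst_w);

    for (int y = 0; y < dst_h; ++y)
    {
        // Source row that this destination row samples from.
        const int sy = static_cast<int>(static_cast<std::int64_t>(y) * src_h / dst_h);
        const std::uint32_t* src_row = src + static_cast<std::size_t>(sy) * src_w;
        std::uint32_t* dst_row = dst + static_cast<std::size_t>(y) * dst_w;

        // Copy one whole 32-bit pixel per destination column.
        for (int x = 0; x < dst_w; ++x)
            dst_row[x] = src_row[src_x[static_cast<std::size_t>(x)]];
    }
}
```

Note that this does plain pixel replication, whereas `FloatImage::resize` with a `BoxFilter` converts to float and filters the image, so the two are not doing the same amount of work per pixel.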

Original issue reported on code.google.com by Kozlov.S...@gmail.com on 1 May 2011 at 1:54

GoogleCodeExporter commented 8 years ago

Original comment by cast...@gmail.com on 13 Sep 2011 at 5:50