katepanping / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Input parameters of the call function will be overwrite #418

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
I call the function ConvertFromI420 in Func like the following

Func(void *a, void *b, double c, double d, int64_t e, int f){
   ConvertFromI420(....)
}

The FOURCC_RGBP is used in ConvertFromI420. I turn on the NEON optimization.
and after calling ConvertFromI420. The variables "c", and "d" of Func will be 
reset to zero. I check the problem and I found "vld1.16    {d8[], d9[]}, 
[%[kUVBiasBGR]]!" in 
YUV422TORGB_SETUP_REG will do this. If I marked it , the variables "c" and "d" 
will not be reset, although the result is wrong. Please help to solve it. 
Thanks!

What is the expected output? What do you see instead?
The variables of Func will not be changed.

What version of the product are you using? On what operating system?
r1348
Android

Please provide any additional information below.
In older version of libyuv which do not support RGB565 dither, it is ok. It 
only happen in the new version which include RBG565 dither function. 

Original issue reported on code.google.com by felixyan...@gmail.com on 30 Mar 2015 at 5:49

GoogleCodeExporter commented 9 years ago
According to aarch64 ABI
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
v8-v15 need to be preserved.

but for 32 bit on android I dont think any vectors need preserving?

Original comment by fbarch...@chromium.org on 31 Mar 2015 at 11:15

GoogleCodeExporter commented 9 years ago
I tried the updated version again. The situation exists! I am sorry I am not 
familiar with the NEON instruction. Would you tell me how to save the register? 
I can do the experiment. Thanks.

Original comment by felixyan...@gmail.com on 1 Apr 2015 at 2:14

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
For ARM64, v8-v15 need to be preserved. You can use stp/ldp instructions 
For ARM32, d8-d15 need to be preserved. You can use vpush/vpop instructions.

Original comment by yang.zh...@arm.com on 1 Apr 2015 at 3:55

GoogleCodeExporter commented 9 years ago
It's ok to preserving d8-d15. I modified the code as following, and it's ok.
However, I don't know if the position of "vpop {d8-d15} is ok because it is 
after bgt b1. I tried to put it before bgt b1 but the program crashed. Please 
help to confirm where to insert vpush and vpop for d8-d15. Thanks.

void I422ToRGB565Row_NEON(const uint8* src_y,
                          const uint8* src_u,
                          const uint8* src_v,
                          uint8* dst_rgb565,
                          int width) {
  asm volatile (
    "vpush      {d8-d15}        \n"
    YUV422TORGB_SETUP_REG
    ".p2align   2                              \n"
  "1:                                          \n"
    READYUV422
    YUV422TORGB
    "subs       %4, %4, #8                     \n"
    ARGBTORGB565
    MEMACCESS(3)
    "vst1.8     {q0}, [%3]!                    \n"  // store 8 pixels RGB565.
    "bgt        1b                             \n"
    "vpop       {d8-d15}        \n"
    : "+r"(src_y),    // %0
      "+r"(src_u),    // %1
      "+r"(src_v),    // %2
      "+r"(dst_rgb565),  // %3
      "+r"(width)     // %4
    : [kUVToRB]"r"(&kUVToRB),   // %5
      [kUVToG]"r"(&kUVToG),     // %6
      [kUVBiasBGR]"r"(&kUVBiasBGR),
      [kYToRgb]"r"(&kYToRgb)
    : "cc", "memory", "q0", "q1", "q2", "q3",
      "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15"
  );
}

Original comment by felixyan...@gmail.com on 1 Apr 2015 at 6:20

GoogleCodeExporter commented 9 years ago
Theoretically, for inline assembly, you needn't save d8-d15 manually. When you 
use these registers, the compiler will help to save them automatically. Only in 
a seperated assembly file, you need to save d8-d15 manually just like what you 
are using.

I have found why this issue is generated. I will fix it. Before I fix it, you 
can work arround as follow:
: "cc", "memory", "q0", "q1", "q2", "q3", "q4",--------------------------> "q4" 
is added
      "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15"

Original comment by yang.zh...@arm.com on 1 Apr 2015 at 6:55

GoogleCodeExporter commented 9 years ago
Thanks. I used your suggestion. It is work!

Original comment by felixyan...@gmail.com on 1 Apr 2015 at 7:11

GoogleCodeExporter commented 9 years ago
fixed in r1353

Original comment by fbarch...@chromium.org on 2 Apr 2015 at 7:58