ps3dev / PSL1GHT

A lightweight PS3 SDK
www.psl1ght.com
MIT License

rsx: commands.c #92

Closed crystalct closed 4 years ago

crystalct commented 4 years ago
s32 __attribute__((noinline)) rsxContextCallback(gcmContextData *context,u32 count)
{
    register s32 result asm("3");
    asm volatile (
        "stdu   1,-128(1)\n"
        "mr     31,2\n"
        "lwz    0,0(%0)\n"
        "lwz    2,4(%0)\n"
        "mtctr  0\n"
        "bctrl\n"
        "mr     2,31\n"
        "addi   1,1,128\n"
        : : "b"(context->callback)
        : "r31", "r0", "lr"
    );
    return result;
}

This doesn't seem to work on real PS3 hardware: gcc 7.2.0 compiles it, but when the PS3 runs it, there is a crash. The rsxtest and rsxtest_spu samples don't work either; maybe there is a correlation.

crystalct commented 4 years ago

Hmm... in a fresh installation it's different... more investigation needed.

zeldin commented 4 years ago

I think this would be better put in a separate .S file. Putting flow control instructions inside asm() is never a good idea...

shagkur commented 4 years ago

I wonder what the resulting assembly code looks like for both gcc 4.x and gcc 7.2.0. Inline assembly is a feature, perfectly fine to be used. Perhaps the spec about register usage by the compiler has slightly changed? https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html

zeldin commented 4 years ago

Well, gcc tends to screw things up if you put global control transfer in there without telling it (and there is no way to tell it)...

shagkur commented 4 years ago

That's why you have to analyze the resulting assembly code sometimes. This way you can tweak it a bit. Hence I'm asking for the resulting assembly.

zeldin commented 4 years ago

Except if you put the code in an .S instead you don't have to analyze and tweak for each gcc version (possibly breaking compat with earlier gcc versions in the process)... :wink: gcc doesn't really bring anything to the table here -- the function is all asm...

crystalct commented 4 years ago

@miigotu @zeldin @shagkur @xerpi @wargio @bucanero We need help understanding why Tiny3D doesn't work (black screen) on real PS3 hardware when compiled with gcc 7. Target: wargio/tiny3d

What I tested:

  • all tiny3d samples compiled with gcc 7 work on the RPCS3 emulator;
  • all RSX graphics samples, the cairo samples and the NoRSX sample compiled with gcc 7 work on the PS3;
  • tiny3d compiled with gcc 4.x works on the PS3;
  • rsxContextCallback and the clobbered registers aren't the problem: that function exists in the libRSX sources and works there;
  • inline functions aren't the problem: I tested rewriting the inline functions as standard functions, and it still doesn't work.

Any ideas?

miigotu commented 4 years ago

I don't even own a PS3 anymore lol, so I can't test for you.

zeldin commented 4 years ago

@crystalct In order to reduce the number of variables, could you try to find out if it's compiling PSL1GHT, tiny3d, or the sample program with gcc7 that breaks things? i.e. compile everything with gcc 4 and then only compile one of the three with gcc 7 and link the result to test it. Ideally we can eventually narrow it down to single .c file that breaks things when compiled with gcc7. Then we can compare the disassemblies to figure out what is going on, as shagkur was saying earlier.

wargio commented 4 years ago

Unfortunately I can't test either because all my ps3s are 500 km away from me

zeldin commented 4 years ago

Hm, when creating a common gcc-PS3 repo with the different gcc versions, I noticed that 7.2.0 is missing the file gcc/config/rs6000/t-cell64lv2. I'm wondering if this could cause programs using floating point math (which I assume that tiny3d is doing) to malfunction?

bucanero commented 4 years ago

@crystalct In order to reduce the number of variables, could you try to find out if it's compiling PSL1GHT, tiny3d, or the sample program with gcc7 that breaks things? i.e. compile everything with gcc 4 and then only compile one of the three with gcc 7 and link the result to test it. Ideally we can eventually narrow it down to single .c file that breaks things when compiled with gcc7. Then we can compare the disassemblies to figure out what is going on, as shagkur was saying earlier.

Just a small note: I have been using the PSL1GHT gcc7 build (toolchain+psl1ght) for my homebrews with no issues, along with the gcc4 build of Tiny3D (libtiny3d.a). My environment is running on macOS. So, at least in my experience:

I don't have the RPCS3 emulator, but I can confirm that if I build tiny3d with gcc7, the same sample app only shows a black screen (on PS3).

zeldin commented 4 years ago

@bucanero Thank you, this is very valuable information. Then we know that the problem is with creating one of the object files that goes into libtiny3d.a, and not in compiling e.g. librsx.a or libgcc.a.

bucanero commented 4 years ago

An additional reference, the gcc4 libtiny3d.a I'm using is this one, from Estwald's repo: https://github.com/Estwald/PSDK3v2/blob/master/libraries-src/Tiny3D/lib/libtiny3d.a

As @crystalct mentioned in another message, if you try to build Estwald's source with gcc7, first it will complain about "clobbered register r2". If you resolve that, I think it complains about some inline function definitions. Crystal fixed those issues in his fork, but the build doesn't behave as expected. What would be the best approach to narrow down the issue to some file/function in tiny3d for further analysis? (I usually add printf()s everywhere, but that might not be the most professional way 😄 )

shagkur commented 4 years ago

Since I haven't touched PS3 dev in ages, I'll first have to build the toolchain. I also have an old VM with my build setup for gcc 4.5. I'll set things up in the next few days and give it a go.

Damián Parrino notifications@github.com wrote on Sun, 28 June 2020 at 15:16:

An additional reference, the gcc4 libtiny3d.a I'm using is this one, from Estwald's repo:

https://github.com/Estwald/PSDK3v2/blob/master/libraries-src/Tiny3D/lib/libtiny3d.a

As @crystalct mentioned in another message, if you try to build Estwald's source with gcc7, first it will complain about "clobbered register r2". If you resolve that, I think it complains about some inline function definitions. Crystal fixed those issues in his fork, but the build doesn't behave as expected. What would be the best approach to narrow down the issue to some file/function in tiny3d for further analysis?


crystalct commented 4 years ago

I took Tiny3D from Estwald/PSDK3v2 and compiled it with gcc7 -> libtiny3d.a, used as the library. I wrote a simple test:

s32 main(s32 argc, const char* argv[])
{
    padInfo padinfo;
    padData paddata;
    int i;
    tiny3d_Init(1024*1024);
    ioPadInit(7);
    while(1) {
        tiny3d_Clear(0xff000000, TINY3D_CLEAR_ALL);
        tiny3d_AlphaTest(1, 0x10, TINY3D_ALPHA_FUNC_GEQUAL);
        tiny3d_BlendFunc(1, TINY3D_BLEND_FUNC_SRC_RGB_SRC_ALPHA | TINY3D_BLEND_FUNC_SRC_ALPHA_SRC_ALPHA,
            TINY3D_BLEND_FUNC_DST_RGB_ONE_MINUS_SRC_ALPHA | TINY3D_BLEND_FUNC_DST_ALPHA_ZERO,
            TINY3D_BLEND_RGB_FUNC_ADD | TINY3D_BLEND_ALPHA_FUNC_ADD);
        ioPadGetInfo(&padinfo);
        for(i = 0; i < MAX_PADS; i++){
            if(padinfo.status[i]){
                ioPadGetData(i, &paddata);
                if(paddata.BTN_CROSS){
                    return 0;
                }
            }
        }
        tiny3d_Project2D();
        tiny3d_Flip();
    }
    return 0;
}

It worked: it waits for the X button and then returns to the XMB. After tiny3d_Project2D() and before the flip, I added:

tiny3d_SetPolygon(TINY3D_QUADS);
tiny3d_VertexPos(0  , 0  , 65535);
tiny3d_VertexColor(0x0040ffff); // light blue
tiny3d_VertexPos(847, 0  , 65535);
tiny3d_VertexPos(847, 511, 65535);
tiny3d_VertexPos(0  , 511, 65535);
tiny3d_End();

The PS3 crashes.

Same library, compiled with gcc 4 -> libtiny3d.a used (so gcc 4 objects). I added the lines above to fill the screen with blue. The simple sample compiled with gcc 7 and linked against the gcc 4 tiny3d lib worked: blue screen, then it waits for the X button. I then compiled the library with gcc 7 to get the objects stored in a folder. Listing the working library (gcc 4) with ppu-ar gives:

buffer.o
commands.o
glue.o
matrix.o
mm.o
realityVP.o
rsxutil.o
tiny3d.o
vshader_text_normal.vcg.o

I replaced them one by one with the gcc 7 objects, using the ppu-ar d and r commands, and ran a new test after every replacement. All of them work, except the last one: tiny3d.o

Now I know where the problem is... good night, for now.

miigotu commented 4 years ago

Is it tiny3d_VertexPos, tiny3d_VertexColor, or tiny3d_End that causes the hang? That might narrow it down even more if you just try one call at a time.

shagkur commented 4 years ago

Today I built the toolchain from scratch (well, it failed on some third-party ps3libraries though) and had a look at the resulting assembly for commands.c. To me this all looks fine. It shouldn't cause any issues with the registers or the memory (although we're entering the PRX part when calling the callback, IIRC). Anyways. Can anyone of you confirm that the psl1ght provided rsx examples work on real HW when built with gcc 7.2.0? @zeldin The inline assembly is still perfectly fine. Your change to remove r1 & r2 from the clobber list was perfectly right, although gcc only got more restrictive regarding the r2 (TOC/PIC) register. But on this occasion we need to modify r2 with the TOC provided by the OS/PRX. I probably have to dig out my old debug PS3 from the cellar :D

crystalct commented 4 years ago

Yes, all samples from ps3dev/PSL1GHT are working.

Here you can read my notes on compiling the gcc 7.2.0 toolchain: https://www.psx-place.com/threads/compiling-open-source-ps3-toolchain-nowadays.30030/

shagkur commented 4 years ago

Okay. that's at least positive :) Btw. can you have a look at this one:

#!/usr/bin/env bash
# NVIDIA cg.dll installer script by CrystalCT (crystal@unict.it)

## Uname string
UNAME=$(uname -a)

if [[ "$UNAME" =~ (CYGWIN|MINGW) ]];
then 
    ## Download a fresh cg.dll
    rm -f cg.zip;
    rm -f cg.dll;
    if [[ "$UNAME" =~ (x86_64|MINGW64) ]];
    then
        wget --continue --no-check-certificate https://wikidll.com/download/3726/cg.zip;
    else
        wget --continue --no-check-certificate https://wikidll.com/download/3730/cg.zip;
    fi

    ## Install cg.dll
    echo "Installing cg.dll in $PS3DEV/bin"
    unzip cg.zip
    rm -f cg.zip
    CGHASH=$(rhash -C cg.dll -p "%{crc32}") 
    if [[ "$CGHASH" =~ (28f6073b|e5228ec2) ]];
    then
        chmod 766 cg.dll
        mv -f cg.dll $PS3DEV/bin;
    else
        echo "Error - cg.dll: wrong CRC32";
    fi
fi

Your version did not work for me on Linux (regex does not work in a POSIX shell). If it's fine with you, I'd commit it.

shagkur commented 4 years ago

So the persisting issue is tiny3d within the emu(s) and on real HW?

bucanero commented 4 years ago

Anyways. Can anyone of you confirm that the psl1ght provided rsx examples work on real HW when built with gcc 7.2.0?

A few days ago (after @miigotu merged #75 ) I was able to build and run the rsxtest example with the gcc7.2 toolchain. I ran the rsxtest in real PS3 hardware, and it worked fine. On the other hand, rsxtest_spu didn't work. (black screen)

btw, I see that you committed some changes to rsxtest, if you need I can test again with my PS3.

So the persisting issue is tiny3d within the emu(s) and on real HW?

From what I understand, the Tiny3d issue is only present on real HW. (it works on RPCS3 emulator). @crystalct should confirm since he has a working emulator environment.

bucanero commented 4 years ago
#!/usr/bin/env bash
# NVIDIA cg.dll installer script by CrystalCT (crystal@unict.it)
## Uname string
UNAME=$(uname -a)
...

Your version did not work for me on Linux (regex does not work in a POSIX shell). If it's fine with you, I'd commit it.

I think this script only makes sense for Windows users to install the cg.dll, so perhaps it might be better to leave it out of the toolchain build workflow? I mean: keep the script, but don't auto-execute it during the normal toolchain build. Also, since it's downloading the .dll files from a third-party website wikidll.com (not from nvidia.com) , some users might be concerned about security issues. Leaving it as a manual script, we let the Windows users decide if they want to download the cg.dll with the script, or get it from the nvidia.com site.

shagkur commented 4 years ago
#!/usr/bin/env bash
# NVIDIA cg.dll installer script by CrystalCT (crystal@unict.it)
## Uname string
UNAME=$(uname -a)
...

Your version did not work for me on Linux (regex does not work in a POSIX shell). If it's fine with you, I'd commit it.

I think this script only makes sense for Windows users, to install the cg.dll, so perhaps it might be better to leave it out of the toolchain build workflow? I mean: keep the script, but don't auto-execute it during the normal toolchain build. Also, since it's downloading the .dll files from a third-party website wikidll.com (not from nvidia.com) , some users might be concerned about security issues. Leaving it as a manual script, we let the Windows users decide if they want to download the cg.dll with the script, or get it from the nvidia.com site.

True. It's only used on Windows, so perhaps it's better to leave it out for other OSes. However, it's still not correct, since '[[' is undefined in a POSIX shell (/bin/sh), and so is the regex part in the if. So for these reasons it needs to be modified anyway, IMO.

shagkur commented 4 years ago

btw, I see that you committed some changes to rsxtest, if you need I can test again with my PS3.

Yes, give it a try

shagkur commented 4 years ago

A few days ago (after @miigotu merged #75 ) I was able to build and run the rsxtest example with the gcc7.2 toolchain. I ran the rsxtest in real PS3 hardware, and it worked fine. On the other hand, rsxtest_spu didn't work. (black screen)

Can you tell me what PS3 OS version you run? I own a PS3 with debug settings, but the 'OS' is rather old. 9 years ago it was not a good idea to update ;) Like i said i'd need to get my old PS3 up and running again to test myself. Not sure if rsx_spu worked at that time. Do you happen to have an old toolchain build too? To compare.

crystalct commented 4 years ago

About the NVIDIA DLL installer: I modified the first IF about UNAME and made a PR. Now it's system independent.

bucanero commented 4 years ago

True. It's only used on Windows, so perhaps it's better to leave it out for other OSes. However, it's still not correct, since '[[' is undefined in a POSIX shell (/bin/sh), and so is the regex part in the if. So for these reasons it needs to be modified anyway, IMO.

oh you're right, if the script syntax doesn't work for /bin/sh then that needs to be addressed first. 👍

Can you tell me what PS3 OS version you run? I own a PS3 with debug settings, but the 'OS' is rather old. 9 years ago it was not a good idea to update ;)

Sure, I've a PS3 running 4.84. (the latest fw right now is 4.86). It's one of the latest super-slim, using the PS3HEN exploit. Btw, right now there are a lot of new tricks and exploits, for example CFW users can downgrade to older firmwares, change CEX to DEX settings, etc. I guess a lot of stuff happened in 9 years. 😄 if you have time and want to dig a little, you might want to update your ps3

Like i said i'd need to get my old PS3 up and running again to test myself. Not sure if rsx_spu worked at that time. Do you happen to have an old toolchain build too? To compare.

I have a VM with Estwald's gcc4 windows toolchain (PSDK3v2). I think it has some slight changes from the "stock" PSL1GHT, but at least on the compiler side, it's gcc4.

shagkur commented 4 years ago

Sure, I've a PS3 running 4.84. (the latest fw right now is 4.86). It's one of the latest super-slim, using the PS3HEN exploit. Btw, right now there are a lot of new tricks and exploits, for example CFW users can downgrade to older firmwares, change CEX to DEX settings, etc. I guess a lot of stuff happened in 9 years. if you have time and want to dig a little, you might want to update your ps3

Yes, that would be great. I'd love to update it. Perhaps, sometimes in the near future we both can have a chat about how to update my PS3 nowadays etc.

I have a VM with Estwald's gcc4 windows toolchain (PSDK3v2). I think it has some slight changes from the "stock" PSL1GHT, but at least on the compiler side, it's gcc4.

Would it be possible for you, to use gcc4, but compile the latest psl1ght and example?

crystalct commented 4 years ago
#!/usr/bin/env bash
# NVIDIA cg.dll installer script by CrystalCT (crystal@unict.it)
## Uname string
UNAME=$(uname -a)
...

Your version did not work for me on Linux (regex does not work in a POSIX shell). If it's fine with you, I'd commit it.

I think this script only makes sense for Windows users, to install the cg.dll, so perhaps it might be better to leave it out of the toolchain build workflow? I mean: keep the script, but don't auto-execute it during the normal toolchain build. Also, since it's downloading the .dll files from a third-party website wikidll.com (not from nvidia.com) , some users might be concerned about security issues. Leaving it as a manual script, we let the Windows users decide if they want to download the cg.dll with the script, or get it from the nvidia.com site.

True. It's only used on Windows, so perhaps it's better to leave it out for other OSes. However, it's still not correct, since '[[' is undefined in a POSIX shell (/bin/sh), and so is the regex part in the if. So for these reasons it needs to be modified anyway, IMO.

About security: I check the CRC. The CRC values are those of the genuine NVIDIA files, and only then are the files installed. The DLL can't be downloaded directly from NVIDIA. On Linux it's simple: the package from NVIDIA installs a few files into the right folders.
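
For reference, the integrity check that the script delegates to rhash can be sketched in C. This is only an illustration: the names `crc32_ieee` and `crc_is_trusted` are made up here, and it assumes that rhash's `%{crc32}` is the standard CRC-32 (reflected polynomial 0xEDB88320). The two accepted values are the ones from the script earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: reflected polynomial 0xEDB88320,
 * initial value and final XOR both 0xFFFFFFFF. */
static uint32_t crc32_ieee(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, zero otherwise */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 if crc equals one of the two values accepted by the script. */
static int crc_is_trusted(uint32_t crc)
{
    static const uint32_t trusted[] = { 0x28f6073bu, 0xe5228ec2u };

    for (size_t i = 0; i < sizeof trusted / sizeof trusted[0]; i++) {
        if (crc == trusted[i])
            return 1;
    }
    return 0;
}
```

Unlike the unanchored `=~` match in the script, which also accepts a string that merely contains one of the two values, this sketch compares for exact equality.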

bucanero commented 4 years ago

Yes, that would be great. I'd love to update it. Perhaps, sometimes in the near future we both can have a chat about how to update my PS3 nowadays etc.

For sure, if I can help let me know! 👍 we can follow up by chat (IRC, discord, etc.) or in any message board, I'm sure you can get your ps3 up-to-date easily 😄

I have a VM with Estwald's gcc4 windows toolchain (PSDK3v2). I think it has some slight changes from the "stock" PSL1GHT, but at least on the compiler side, it's gcc4.

Would it be possible for you, to use gcc4, but compile the latest psl1ght and example?

I'll give it a try, and ping back with the feedback

bucanero commented 4 years ago

btw, I see that you committed some changes to rsxtest, if you need I can test again with my PS3.

Yes, give it a try

ok, I've just tested rsxtest on my PS3 with the latest changes:

  • rsxtest.self starts
  • it shows the red/black bubble thing and freezes

(my previous build with gcc7 was running fine, showing the red/black bubble moving around)

edit: I'm sharing the elf/self files of both builds just in case rsx-working.zip rsx-notworking.zip

edit2: reverting 09f3d1da07e28002f3460dc49bde7f957dd364c3 fixed rsxtest (runs ok on my ps3)

crystalct commented 4 years ago

It crashes in the put_vertex() function, exactly here: memcpy((void *) &rsx_vertex[pos_rsx_vertex], (void *) &vertex_data.x, 16);

After 2 VertexPos calls, the 3rd VertexPos fails...

shagkur commented 4 years ago

Can you, locally, revert the changes i made to commands.c? In rsxtest i just removed unused var.

Damián Parrino notifications@github.com wrote on Mon, 29 June 2020 at 16:55:

btw, I see that you committed some changes to rsxtest, if you need I can test again with my PS3.

Yes, give it a try

ok, I've just tested rsxtest on my PS3 with the latest changes:

  • rsxtest.self starts
  • it shows the red/black bubble thing and freezes

(my previous build with gcc7 was running fine, showing the red/black bubble moving around)

should I share the elf/self files of those builds?


bucanero commented 4 years ago

Can you commit to master, or should I do a pull request?

Yes, I reverted that change to commands.c and the test worked fine again. 👍🏻

shagkur commented 4 years ago

Can you commit to master, or should I do a pull request? Yes, I reverted that change to commands.c and the test worked fine again. 👍🏻

Okay, I pushed the changes (I did not revert, but used the correct constraint modifier now). Btw, does rsxtest_spu still fail?

crystalct commented 4 years ago

My test work is here: https://github.com/crystalct/tiny3d_Estwald_work

I checked a lot... the culprit is the rsx_vertex memory and this command: memcpy((void *) &rsx_vertex[pos_rsx_vertex], (void *) &vertex_data.x, 16); libtiny3d.a compiled with gcc 4 and with gcc 7 produces the same results in terms of memory addresses and sizes. Debug output of the non-working lib:

pos_rsx_vertex: 20 position inside rsx_vertex memory
rsx_vertex memory address: 0xc0f01100
Copying: 16 byte - float size: 4
vertex_data.x: 847.000000
vertex_data.y: 0.000000
vertex_data.z: 65535.000000
vertex_data.w: 1.000000
rsx_vertex[pos_rsx_vertex] memory address: 0xc0f01114

Working lib debug results:

pos_rsx_vertex: 20 position inside rsx_vertex memory
rsx_vertex memory address: 0xc0f01100
Copying: 16 byte - float size: 4
vertex_data.x: 847.000000
vertex_data.y: 0.000000
vertex_data.z: 65535.000000
vertex_data.w: 1.000000
rsx_vertex[pos_rsx_vertex] memory address: 0xc0f01114
memcpy ok
put_vertex 2
put_vertex 7
put_vertex 8
...
..
.

The test samples were always compiled with gcc 7; only libtiny3d was compiled once with gcc 4 (working) and once with gcc 7 (not working). Maybe there is a difference in memory management in the lib compiled with gcc 4? Heap size?

wargio commented 4 years ago

what instead you optimizing this out via non memset and use a normal load/store?

crystalct commented 4 years ago

what instead you optimizing this out via non memset and use a normal load/store?

I don't know what you are talking about...

A new thing: when compiling the testfont sample without -O2 (no optimization at all), the sample closes immediately because if(paddata.BTN_CROSS){ is already true.

shagkur commented 4 years ago

Yes, that would be great. I'd love to update it. Perhaps, sometimes in the near future we both can have a chat about how to update my PS3 nowadays etc.

For sure, if I can help let me know! we can follow up by chat (IRC, discord, etc.) or in any message board, I'm sure you can get your ps3 up-to-date easily

I have a VM with Estwald's gcc4 windows toolchain (PSDK3v2). I think it has some slight changes from the "stock" PSL1GHT, but at least on the compiler side, it's gcc4.

Would it be possible for you, to use gcc4, but compile the latest psl1ght and example?

I'll give it a try, and ping back with the feedback

I installed my DEX PS3 now (Firmware 3.6 tho). I'm now ready to upgrade this little boy :smiley:

bucanero commented 4 years ago

Okay, I pushed the changes (I did not revert, but used the correct constraint modifier now). Btw, does rsxtest_spu still fail?

Sorry, I didn't test rsxtest_spu last time. 😅
I'll check-out the new changes and test both (rsxtest, rsxtest_spu)

bucanero commented 4 years ago

@shagkur , I've tested the samples with the latest changes:

I installed my DEX PS3 now (Firmware 3.6 tho). I'm now ready to upgrade this little boy 😃

btw, if you need help/info upgrading your ps3, let me know

crystalct commented 4 years ago

Tiny3D has a lot of rsx functions rewritten. Would it be a good idea to rewrite Tiny3D's rsx communication layer to use the rsx functions directly?

crystalct commented 4 years ago

Great news... my testfonts sample started on a real PS3. I changed memcpy((void *) &(rsx_vertex[pos_rsx_vertex]), (const void *) &(vertex_data.x), 16); to

memcpy((void *) &(rsx_vertex[pos_rsx_vertex]), (const void *) &(vertex_data.x), 4);
memcpy((void *) &(rsx_vertex[pos_rsx_vertex+4]), (const void *) &(vertex_data.y), 4);
memcpy((void *) &(rsx_vertex[pos_rsx_vertex+8]), (const void *) &(vertex_data.z), 4);
memcpy((void *) &(rsx_vertex[pos_rsx_vertex+12]), (const void *) &(vertex_data.w), 4);

Does this mean that the fields of the data structure are not stored consecutively in memory? Is that the problem?

Edit: vertex_data.x :0x1001227c vertex_data.y :0x10012280 vertex_data.z :0x10012284 vertex_data.w :0x10012288
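
The four addresses in the edit above are each 4 bytes apart, which is what contiguous float fields look like. A small check along these lines can confirm that for a given compiler. It is a sketch under assumptions: `vertex4f` is a hypothetical stand-in for Tiny3D's vertex data (its real definition is not shown in this thread), and the helper names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the structure behind vertex_data. */
typedef struct {
    float x, y, z, w;
} vertex4f;

/* Addresses of vertex_data.x/.y/.z/.w as reported in the thread. */
static const unsigned long logged_addrs[4] = {
    0x1001227cUL, 0x10012280UL, 0x10012284UL, 0x10012288UL
};

/* Returns 1 if the four floats are laid out back to back without padding. */
static int fields_are_contiguous(void)
{
    return offsetof(vertex4f, x) == 0
        && offsetof(vertex4f, y) == 1 * sizeof(float)
        && offsetof(vertex4f, z) == 2 * sizeof(float)
        && offsetof(vertex4f, w) == 3 * sizeof(float)
        && sizeof(vertex4f) == 4 * sizeof(float);
}

/* Returns 1 if every address is exactly `step` bytes after the previous one. */
static int gaps_equal(const unsigned long *addrs, size_t count, unsigned long step)
{
    assert(addrs != NULL && count >= 1);
    for (size_t i = 1; i < count; i++) {
        if (addrs[i] - addrs[i - 1] != step)
            return 0;
    }
    return 1;
}
```

With the logged values, `gaps_equal(logged_addrs, 4, 4)` holds, so the source fields are consecutive and the layout of the structure is unlikely to be the cause.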

zeldin commented 4 years ago

I think it's more likely that a memcpy of length 16 uses 64-bit accesses, but your destination address is not 64-bit aligned. memcpy shouldn't make any assumptions about alignment though, so in that case it's a bug...
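
To make the alignment point concrete: the destination address from the debug output, 0xc0f01114, is a multiple of 4 but not of 8. A minimal sketch of that check follows. The helper name `is_aligned` is made up for illustration, and whether a misaligned 64-bit store is what actually fails here is still a hypothesis at this point in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the debug output posted earlier in the thread. */
static const uint64_t rsx_vertex_base = 0xc0f01100u;
static const uint64_t pos_rsx_vertex  = 20;

/* Returns 1 if addr is a multiple of alignment (which must be a power of two). */
static int is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

Here `rsx_vertex_base + pos_rsx_vertex` is 0xc0f01114: `is_aligned` reports it as 4-byte aligned but not 8-byte aligned, while the base address itself is 8-byte aligned.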

crystalct commented 4 years ago

So it's just a coincidence that we found this bug inside tiny3d, memcpy could be used everywhere... How and where to solve this bug?

zeldin commented 4 years ago

Yup, if I compile this:

#include <string.h>

void testfunc(unsigned char *dest, const float *src)
{
  memcpy((void *)dest, (const void *)src, 16);
}

with -O2 it becomes:

        ld 10,0(4)
        ld 9,8(4)
        std 10,0(3)
        std 9,8(3)

Interestingly, if I use -Os instead, it becomes:

        lswi 5,4,16
        stswi 5,3,16
        blr

Does Cell support string instructions in hardware?

zeldin commented 4 years ago

As for how to fix it, remind me: Does Cell support unaligned 64-bit access in regular memory? If it doesn't, then this is a potential problem for any memcpy, and needs to be fixed in gcc. If it's only RSX memory that requires alignment, then it should be acceptable to say that use of gcc-builtin memcpy with RSX destination is not supported. You can use the compiler flag -fno-builtin-memcpy to get the call to real memcpy instead.
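
If it turns out that only RSX-mapped memory needs this care, another option besides the compiler flag would be a small copy helper that forces byte-sized stores. The sketch below is an illustration under assumptions, not code from PSL1GHT or Tiny3D: `copy_bytes_volatile` is a made-up name, and it relies on the compiler not merging or widening volatile accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Copies len bytes one at a time; the volatile destination keeps the
 * compiler from replacing the loop with wider stores. */
static void copy_bytes_volatile(volatile unsigned char *dest,
                                const unsigned char *src, size_t len)
{
    assert(len == 0 || (dest != NULL && src != NULL));
    for (size_t i = 0; i < len; i++)
        dest[i] = src[i];
}

/* Self-check: copies 16 bytes to offset 20, mirroring pos_rsx_vertex
 * from the debug output, and verifies nothing else was touched. */
static int copy_bytes_volatile_selftest(void)
{
    unsigned char buffer[64] = { 0 };
    unsigned char source[16];

    for (size_t i = 0; i < sizeof source; i++)
        source[i] = (unsigned char)(i + 1);

    copy_bytes_volatile(buffer + 20, source, sizeof source);

    for (size_t i = 0; i < sizeof source; i++) {
        if (buffer[20 + i] != source[i])
            return 0;
    }
    return buffer[19] == 0 && buffer[36] == 0;
}
```

This trades speed for predictability, so it would only make sense for the few places that write into RSX memory.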

zeldin commented 4 years ago

Of course, you could also replace the memcpy call with something like

float *v = (void *) &(rsx_vertex[pos_rsx_vertex]);
*v++ = vertex_data.x;
*v++ = vertex_data.y;
*v++ = vertex_data.z;
*v++ = vertex_data.w;

which will be more efficient than the bytewise copy that memcpy would do. This is what wargio meant by "normal load/store".
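
A self-contained version of the store-by-field idea can be tried on any host. This is a sketch only: `vertex4f` and `store_vertex` are hypothetical names, malloc'd memory stands in for the RSX buffer, and the offset 20 and the vertex values are the ones from the debug output in this thread. Which store instructions the compiler really emits still has to be checked in the disassembly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the structure behind vertex_data. */
typedef struct {
    float x, y, z, w;
} vertex4f;

/* Writes the vertex as four float stores; dest only needs float alignment. */
static void store_vertex(void *dest, const vertex4f *v)
{
    float *out = dest;

    assert(dest != NULL && v != NULL);
    assert(((uintptr_t)dest % sizeof(float)) == 0);
    out[0] = v->x;
    out[1] = v->y;
    out[2] = v->z;
    out[3] = v->w;
}

/* Self-check: stores at offset 20 of a heap buffer and reads the floats back. */
static int store_vertex_selftest(void)
{
    unsigned char *buffer = malloc(64);
    vertex4f v = { 847.0f, 0.0f, 65535.0f, 1.0f };
    float readback[4];
    int ok;

    if (buffer == NULL)
        return 0;

    store_vertex(buffer + 20, &v);
    memcpy(readback, buffer + 20, sizeof readback);
    ok = readback[0] == 847.0f && readback[1] == 0.0f
      && readback[2] == 65535.0f && readback[3] == 1.0f;

    free(buffer);
    return ok;
}
```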

crystalct commented 4 years ago

which will be more efficient than the bytewise copy that memcpy would do. This is what wargio meant by "normal load/store".

Now I understand... but Tiny3D was written 10 years before I entered the PS3 scene... I didn't write it.