Open Larkooo opened 11 months ago
Hey @Larkooo, could you provide an example?
Yes, our union C structs look like
typedef struct {
enum tag;
union {
struct struct_;
enum enum_;
...
}
} ty;
first of all, it seems like the field offsets are wrong, they are all 0 when for eg. struct and enum should be of fieldoffset(8)
and the generated fields dont use the anonymous fields - which seems to be the issue for why the offsets are wrong i believe. (the generated metadata from castffi seems to be ok tho)
dojo.h if you were wondering, this is the header file we're having an issue with. i have a scoped fork of c2cs so i fixed it for our use case but best would be to look into the anonymous fields issue
When looking into it, with the new version of castffi
you can do the following for your extract config so there is only one file:
{
"inputFilePath": "./dojo.h",
"targetPlatforms": {
"windows": {
"x86_64-pc-windows-msvc": {},
"aarch64-pc-windows-msvc": {}
},
"macos": {
"aarch64-apple-darwin": {},
"x86_64-apple-darwin": {}
},
"linux": {
"x86_64-unknown-linux-gnu": {},
"aarch64-unknown-linux-gnu": {}
}
}
}
Other than that I'm looking at Primitive
an example and something indeed seems to be wrong.
C# code:
[CNode(Kind = "Struct")]
[StructLayout(LayoutKind.Explicit, Size = 40, Pack = 8)]
public struct Primitive
{
[FieldOffset(0)] // size = 4
public Primitive_Tag tag;
[FieldOffset(0)] // size = 1
public byte u8;
[FieldOffset(0)] // size = 2
public ushort u16;
[FieldOffset(0)] // size = 4
public uint u32;
[FieldOffset(0)] // size = 8
public ulong u64;
[FieldOffset(0)] // size = 16
public fixed byte _u128[16]; // uint8_t[16]
public readonly Span<byte> u128
{
get
{
fixed (Primitive* @this = &this)
{
var pointer = &@this->_u128[0];
var span = new Span<byte>(pointer, 16);
return span;
}
}
}
[FieldOffset(0)] // size = 32
public fixed byte _u256[32]; // uint64_t[4]
public readonly Span<ulong> u256
{
get
{
fixed (Primitive* @this = &this)
{
var pointer = &@this->_u256[0];
var span = new Span<ulong>(pointer, 4);
return span;
}
}
}
[FieldOffset(0)] // size = 4
public uint u_size;
[FieldOffset(0)] // size = 1
public CBool p_bool;
[FieldOffset(0)] // size = 32
public FieldElement felt252;
[FieldOffset(0)] // size = 32
public FieldElement class_hash;
[FieldOffset(0)] // size = 32
public FieldElement contract_address;
}
C code:
typedef struct Primitive {
Primitive_Tag tag;
struct {
union {
struct {
uint8_t u8;
};
struct {
uint16_t u16;
};
struct {
uint32_t u32;
};
struct {
uint64_t u64;
};
struct {
uint8_t u128[16];
};
struct {
uint64_t u256[4];
};
struct {
uint32_t u_size;
};
struct {
bool p_bool;
};
struct {
struct FieldElement felt252;
};
struct {
struct FieldElement class_hash;
};
struct {
struct FieldElement contract_address;
};
};
};
} Primitive;
I would expect it to be like:
[FieldOffset(0)] // size = 4
public Primitive_Tag tag;
[FieldOffset(4)] // size = 1
public byte u8;
[FieldOffset(4)] // size = 2
public ushort u16;
[FieldOffset(4)] // size = 4
public uint u32;
[FieldOffset(4)] // size = 8
public ulong u64;
I think there are improvements to be made for both castffi
and c2cs
in this case. I never experienced such usage of union
and struct
in C before. I can add some tests to capture such usage.
Yes, as I said I have fixed that issue in my fork of c2cs so I would be open to contributing to castffi/c2cs if I know where to look at and if there are written tests for those use cases, as it is quite difficult to get through the C parsing part of the code. As for the FieldOffset(s) I think that it should be 8 as the struct packing is set to 8 bytes.
I think that it should be 8 as the struct packing is set to 8 bytes.
Yeah I suppose so.
If you put up a MR that be great.
The code was basically manually tested against versus C libraries that I come across. Your C code is interesting because I have seen anonymous unions being used as such but never have I seen someone use anonymous structs like that. I have added tests but they don't cover anonymous structs and anonymous unions like this yet.
c2cs
should be pretty easy to deal with. castffi
is probably the hardest part. Both have identical test setup by using some C code for tests.
The related C code that "handles" a struct would be here for castffi
: https://github.com/bottlenoselabs/CAstFfi/blob/main/src/cs/production/CAstFfi.Tool/Features/Extract/Domain/Explore/Handlers/StructExplorer.cs#L31.
There is a similar one for unions: https://github.com/bottlenoselabs/CAstFfi/blob/main/src/cs/production/CAstFfi.Tool/Features/Extract/Domain/Explore/Handlers/UnionExplorer.cs#L31.
What the harder part is determining how to identify a cursor so it can be "handled". Basically how it works for castffi
is that using libclang
, "cursors" in the abstract syntax tree can be "visited" recursively. Once visited they are enqueued into a double ended queue data structure for processing at a later stage (separated into 3 parts as variables, functions, and types) as "info nodes". The names are cached so they are not added again in case a cursor is seen multiple times. Once every "top level" cursor is visited then they are "explored" (i.e. processed). Note that as cursors are processed their "descendants" are "visited" leading to a recursive pattern until there are not more cursors to "visit" or "explore".
Anonymous fields have a type and that anonymous type is given a name. The code is here https://github.com/bottlenoselabs/CAstFfi/blob/main/src/cs/production/CAstFfi.Tool/Features/Extract/Domain/Explore/ExploreContext.cs#L73 then here https://github.com/bottlenoselabs/CAstFfi/blob/main/src/cs/production/CAstFfi.Tool/Features/Extract/Domain/Explore/ExploreContext.cs#L196. This happens so the name can be later looked up for the extracted data.
Expected test data would be added to https://github.com/bottlenoselabs/CAstFfi/tree/main/src/cs/tests/CAstFfi.Tests/Data/Values with a new folder for Structs
and Unions
per target platform. Changing the regenerateFiles
to true
in the constructor here https://github.com/bottlenoselabs/CAstFfi/blob/main/src/cs/tests/CAstFfi.Tests/TestCCode.cs#L100 can be used to generate the expected test data. There already exists a method GetRecord
and TryGetRecord
which can be used for tests: https://github.com/bottlenoselabs/CAstFfi/blob/main/src/cs/tests/CAstFfi.Tests/TestCCodeAbstractSyntaxTree.cs#L134. Though perhaps it should be renamed for Struct
and Union
.
the anonymous field dont seem to be used when generating a union struct