Closed kerhong closed 11 months ago
Is it correct to assume that the eventual goal is that all packets should survive parse->encode->parse without changes (ignoring header encryption)?
Thank you for the very thorough bug report. I have pushed a fix to master that passes your test case.
In order to create unit tests for the individual messages you can write a test
wowm
object. It can be auto generated from a specific message through the (to_test_case_string
)[https://github.com/gtker/wow_messages/blob/009948f4caa0fcac8fe85d2c7b19f51dedd7a72c/wow_world_messages/src/world/wrath/smsg_account_data_times.rs#L77] function. If you could obtain a valid message in this format a test would be auto generated that would prevent any regressions, as well as better coverage for downstream implementations in other languages.
Is it correct to assume that the eventual goal is that all packets should survive parse->encode->parse without changes (ignoring header encryption)?
Yes, absolutely. This bug was because I incorrectly added a compressed_data
field to endless arrays (u8[-]
) because these are usually compressed. I removed the field for non-compressed members.
Are you working on wrath-rs
or your own server?
@gtker I'm currently working on a toy project (not public yet) which for now is just a proxy for authserver + worldserver, later also a client implementation.
It's grown from just using wow_messages to try and generate code intermediate_representation.json myself (as I find wow_messages codegen to be very confusing). Is there a reason you didn't use quote?
Will try to write .wowm test cases for future issues I find
It's grown from just using wow_messages to try and generate code intermediate_representation.json
Are you using the intermediate_representation.json
file? I've been assumed that I was the only one using it for wow_messages_python
so I've been doing breaking changes rather freely.
The idea is for users of intermediate_representation.json
to have to do as little actual calculation/thinking as possible. For example by inlining full struct types to avoid having to do any lookup.
Do you have any ideas for improvements?
(as I find wow_messages codegen to be very confusing).
Yeah, I understand completely. The layout and general idea of wow_messages
has changed many times, and there are even sections where I'm not sure I even follow the logic any more because of how many times the requirements have changed.
Is there a reason you didn't use quote?
The line based output that I have makes it significantly easier to debug and track changes. This is also why I don't run rustfmt
on the output before committing.
Are you using the intermediate_representation.json file? I've been assumed that I was the only one using it for wow_messages_python so I've been doing breaking changes rather freely.
No issues as of yet, and I use both intermediate_representation.json
and intermediate_representation_schema.json
(which I transform to rust types using https://github.com/jsontypedef/json-typedef-codegen/), so I'd notice any changes.
JSON with a schema that is processed and simplified already is much easier to hack around than parsing wowm.
Do you have any ideas for improvements?
I am interested in this project, but currently only in the .wowm/intermediate representation part. I hope (probably presumptuously) that I can write a "better" rust implementation from the input data you maintain.
I will be opening PRs about contents of .wowm and wowm->IR translation, but probably not about rest of codegen. That being said- since I am working on generating code form same rules I am noticing possible issues about codegen.
Should I open all kinds of weird issues (and possible issues) I manage to find even if I have no intention on fixing them?
Some things I already have noted:
Fuzzing cannot be easily performed currently due to 2 issues:
write_unencrypted_server
)I can't find a way to read ServerOpcodeMessage and get "total bytes read" after it's read, how about exposing changing ServerOpcodeMessage::read_opcodes
?
// current
// - private
// - caller can't see how r was changed
fn read_opcodes(opcode: u16, body_size: u32, mut r: &[u8]) -> Result<...>
// new
// - public
// - takes `&mut &[u8]` so caller can see how much was read
pub fn read_opcodes(opcode: u16, body_size: u32, r: &mut &[u8]) -> Result<...>
rust_analyzer
work slowly, I think some generated code could be optimized (for parsing, not for execution), not sure where to even startread_sized_c_string_to_vec
is wrong (vec![0u8; user provided u32]
is a crash exploit)
0x7fffff
, then that should be max valid size for read_sized_c_string_to_vec
read_sized_c_string_to_vec
is utf-8 valid (so can't have 0x00
in the middle), then u32
length can just be thrown out and it can be read as null terminated stringbool
implementation is not parse->encode->parse stable (as it reads any non-zero value as true
)
bool
as a type is wrong and should be u8
bool
should fail parsing on other valuesbool
must be replaced by u8
, as there could be servers out there that send other values for bool even if wow client itself doesn'ttag_all
usage is confusing to me. For example cmsg_cancel_cast.wowm has #tag_all versions "1 2 3";
and then test marked as versions = "1.12";
. Is the correct interpretation that tests are:
1 2 3 1.12
- this seems how it works currnetly for wowm->intermediate, can I suggest it should be an error if item explicitly has a version that tag_all already covers.1.12
If you agree- I'll convert these (and more) into issues, just wanted to check in before I start spamming you.
I am interested in this project, but currently only in the .wowm/intermediate representation part. I hope (probably presumptuously) that I can write a "better" rust implementation from the input data you maintain.
You probably can, at least you can probably better optimize it for your use case.
Should I open all kinds of weird issues (and possible issues) I manage to find even if I have no intention on fixing them?
Yes, definitely.
There is no access to writing a header (write_unencrypted_server)
I deliberately wanted to hide as many unnecessary details as possible for users, which is why I also don't expose the opcode numbers.
Do you want to be able to only write a header and no body?
I can't find a way to read ServerOpcodeMessage and get "total bytes read" after it's read, how about exposing changing ServerOpcodeMessage::read_opcodes?
This is also not exposed by design. :)
I probably don't mind changing the function signature of read_opcodes
, but I don't want it public since users really should only be able to read full messages, without having to care about the exact wire format.
The generated code makes rust_analyzer work slowly,
Oh yeah. I'm also not sure how to fix it. I'm thinking it might have something to do with the match statements that match 400+ cases that all do basically the same thing.
I may be able to do some optimizations, but no promises since no matter how you slice it there's 300k+ lines just in wow_world_messages
alone.
There are some types in .wowm (for example "spell") that don't seem to be documented and are just silently treated as u32, the intermediate representation doesn't even contain this type (it's just u32)
Yeah, the intention was always there to have a dedicated Spell
/Item
type that wasn't just a u32
/u16
, but it got a little lost along the way.
read_sized_c_string_to_vec
is wrong (vec![0u8; user provided u32] is a crash exploit)
That's definitely true. This is also something that I just haven't gotten around to because it's only used by clients. I would like to maintain the optimization of being told how long the string is, but there should definitely be a maximum on it.
current
bool
implementation is not parse->encode->parse stable (as it reads any non-zero value as true)
I think this is one of those things where we'll never get a "perfect" solution just different tradeoffs.
For wow_world_messages
/wow_login_messages
I want to make the library as ergonomic as possible for users, which is why I don't want to expose a u8
when the only possible options are negative/positive.
This is complicated by the fact that C/C++ treat any positive integer as true
and the WoW message format almost just being structs dumped on the wire knowing that the client has the same internal representation.
I think it's a good idea to have an issue for this, if only to document a decision.
It might make sense to have bools with a value other than 0
/1
be a panic just to figure out what the behavior actually is.
Some tag_all usage is confusing to me. [..]
This is just straight up an error on my part.
I don't have any real exact guidelines for how to do things, but I think I'm currently of the opinion that tests should be marked only with the exact version that they were tested with.
can I suggest it should be an error if item explicitly has a version that tag_all already covers.
Yes, from how I currently feel about it, I think it should be.
If you agree- I'll convert these (and more) into issues, just wanted to check in before I start spamming you.
Please do. I can't promise a speedy resolution since I'm working on/off on this, but having tracking issues for all these will definitely be good.
I also want to mention the awesome-wow-rust
repo which lists most of the WoW Rust projects, as well as the discord for it.
You should add your project to the repo when it's public.
There are some types in .wowm (for example "spell") that don't seem to be documented and are just silently treated as u32, the intermediate representation doesn't even contain this type (it's just u32)
This is now correctly passed to the IR, but not used in wow_messages
yet.
Incorrect issue.
The original issue was fixed.
I created issues from comments
I think this can be closed
For 3.3.5
SMSG_ACCOUNT_DATA_TIMES
does not survive parse->encode round trip.SMSG_ACCOUNT_DATA_TIMES::read_inner
reads an an extra u32 (data_decompressed_size
) before readingdata
. https://github.com/gtker/wow_messages/blob/f05ce8ceb6dbba6e92b42bb0ae9ddf8444f0fbaf/wow_world_messages/src/world/wrath/smsg_account_data_times.rs#L43-L56The .wowm and also
SMSG_ACCOUNT_DATA_TIMES::write_into_vec
do not have this u32 https://github.com/gtker/wow_messages/blob/f05ce8ceb6dbba6e92b42bb0ae9ddf8444f0fbaf/wow_message_parser/wowm/world/login_logout/smsg_account_data_times.wowm#L45-L55https://github.com/gtker/wow_messages/blob/f05ce8ceb6dbba6e92b42bb0ae9ddf8444f0fbaf/wow_world_messages/src/world/wrath/smsg_account_data_times.rs#L135-L140
I think
data_decompressed_size
is incorrect, as I can't find it in other implementations andCacheMask::PerCharacterCache
has 5 bits set, sodata
should have 5 elements (see comment fordata
in .wowm).Here is a failing test case for this bug: