The JSON deserialization seems to be the problem. The stack overflow seems to happen here.
Note: the bot already existed and has some pending updates, so Telegram sends the same JSON every time. It's a JSON list of updates; nothing looks suspicious to me.
This is an excerpt of the response from Telegram that crashed the JSON parser (ids scrambled):
{
  "ok": true,
  "result": [
    {
      "update_id": 77345345072,
      "poll": {
        "id": "5345345261554",
        "question": "What would you like to order, master?",
        "options": [
          {
            "text": "The Amici for some unhealthy snack",
            "voter_count": 1
          },
          {
            "text": "The Beef Club for distinguished Ladies and Gentlemen",
            "voter_count": 0
          },
          {
            "text": "Kaishi Combo for the extraordinarily exotic",
            "voter_count": 1
          }
        ],
        "total_voter_count": 2,
        "is_closed": false,
        "is_anonymous": true,
        "type": "regular",
        "allows_multiple_answers": false
      }
    },
    {
      "update_id": 7343450073,
      "poll": {
        "id": "544757133453451554",
        "question": "What would you like to order, master?",
        "options": [
          {
            "text": "The Amici for some unhealthy snack",
            "voter_count": 1
          },
          {
            "text": "The Beef Club for distinguished Ladies and Gentlemen",
            "voter_count": 0
          },
          {
            "text": "Kaishi Combo for the extraordinarily exotic",
            "voter_count": 0
          }
        ],
        "total_voter_count": 1,
        "is_closed": false,
        "is_anonymous": true,
        "type": "regular",
        "allows_multiple_answers": false
      }
    },
    {
      "update_id": 7734345074,
      "message": {
        "message_id": 34345,
        "from": {
          "id": 2934534548,
          "is_bot": false,
          "first_name": "xxxx"
        },
        "chat": {
          "id": -13453457001,
          "title": "bot test group",
          "type": "supergroup"
        },
        "date": 1634535345,
        "text": "Nice"
      }
    },
  ]
}
...and then some more updates; about 500 lines when the JSON is formatted.
The example async_get_me.rs works just fine. The example reply_to_message_updates.rs also overflows.
It also overflows for the following response (ids scrambled):
{
  "ok": true,
  "result": [
    {
      "update_id": 234523452345,
      "message": {
        "message_id": 234532452345,
        "from": {
          "id": 23453452435,
          "is_bot": false,
          "first_name": "Johannes",
          "username": "cvyxcvxycv",
          "language_code": "en"
        },
        "chat": {
          "id": 234523452435,
          "first_name": "Johannes",
          "username": "cvyxcvxycv",
          "type": "private"
        },
        "date": 234523452345,
        "text": "hi"
      }
    }
  ]
}
@johannesvollmer yes, this issue is happening only on Windows. We need to investigate this.
I haven't seen this happening on *nix.
cc @EdJoPaTo
I tried to add the following test to the async_telegram_api_impls.rs test module (ids scrambled):
#[test]
fn parse_message_success() {
    let response_string = r#"
        {
            "ok": true,
            "result": [
                {
                    "update_id": 234523452345,
                    "message": {
                        "message_id": 234532452345,
                        "from": {
                            "id": 23453452435,
                            "is_bot": false,
                            "first_name": "XXX",
                            "username": "cvyxcvxycv",
                            "language_code": "en"
                        },
                        "chat": {
                            "id": 234523452435,
                            "first_name": "XXX",
                            "username": "cvyxcvxycv",
                            "type": "private"
                        },
                        "date": 234523452345,
                        "text": "hi"
                    }
                }
            ]
        }
    "#;

    let json_result: MethodResponse<Vec<Update>> = {
        serde_json::from_str(response_string).unwrap()
    };
}
I cannot even get it to run lol
I got it to run, but the test did not fail. Running the example still overflows though, exactly at the serde_json call.
Have you tried creating a bot, sending it messages on Telegram before ever starting it, and then running the bot with those expected updates?
Maybe #8 is related here, as it was a Windows-only problem too? Would be interesting to find out the cause.
I wanted to present it to my colleagues tomorrow. Any ideas what I can try next?
Calling serde_json::from_str(message) succeeds within a test, but fails while running the example. The test passes with the same message string as the example; how can this be? Maybe some kind of global state is different when running in testing mode than in example mode?
This is what I tried:
dbg!(&message);
dbg!(std::any::type_name::<T>()); // gets printed as expected.
let json_result: Result<T, serde_json::Error> = serde_json::from_str(&message);
dbg!(json_result.is_ok()); // never gets printed.
Interesting news! When dividing serde_json::from_str into two separate stages, it works. Replace

let json_result: Result<T, serde_json::Error> = serde_json::from_str(&message);

with

let json_result: Result<T, serde_json::Error> = serde_json::Value::from_str(&message).and_then(serde_json::from_value);

Of course, this is barely a workaround; performance will be inferior due to the increased number of allocations and the two stages.
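For reference, here is a minimal, self-contained sketch of that two-stage variant. The helper function and its name are hypothetical; only the serde_json calls are real. Note that serde_json::Value::from_str needs std::str::FromStr in scope.

use serde::de::DeserializeOwned;
use std::str::FromStr; // required for serde_json::Value::from_str

// Two-stage parse: first into a heap-allocated serde_json::Value tree,
// then from that Value into the target type T.
fn decode_two_stage<T: DeserializeOwned>(message: &str) -> Result<T, serde_json::Error> {
    serde_json::Value::from_str(message).and_then(serde_json::from_value)
}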
@EdJoPaTo yes seems like the same problem
After reading https://github.com/rust-lang/rust/issues/80289, I now have a theory: std::mem::size_of::<Update>() is 30384 bytes. This means a single Update instance is about 30 kB! I presume this type is allocated a lot, maybe even recursively, in the JSON deserialization function.

If this theory turns out to be true, I see three solutions:
- use an enum in struct Update and in struct Message (assuming most fields will never be set at the same time),
- use Option<Box<Message>> in struct Update,
- or box some fields in the Message type (may have performance implications).

The following observations make me think the stack overflow theory is actually legitimate:
- Boxing all Message fields in struct Update finally made the example run on my device.
- The type Update, formerly 30384 bytes, is now 9872 bytes (roughly 10 kB). Boxing all the fields in Update makes it about 120 bytes.
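To illustrate why boxing shrinks the outer type, here is a toy example with a made-up payload type; it is not the real Update or Message definition:

// Toy demonstration: a boxed field costs only a pointer in the containing
// struct, while an inline field contributes its full size. The 4 KiB payload
// is invented purely for illustration.
struct BigPayload([u8; 4096]);

struct Inline {
    payload: Option<BigPayload>, // stored inline: Inline itself is larger than 4 KiB
}

struct Boxed {
    payload: Option<Box<BigPayload>>, // behind a pointer: Boxed stays pointer-sized
}

fn main() {
    println!("Inline: {} bytes", std::mem::size_of::<Inline>()); // 4097 bytes
    println!("Boxed:  {} bytes", std::mem::size_of::<Boxed>());  // 8 bytes on 64-bit targets
}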
However, I think making these fields an enum instead of boxing them would be the cleanest solution.
I have a fork with the quick-fix changes; it seems to work for now. If the enum solution turns out to take too long, this could serve as a temporary workaround. What do you think? Any ideas how we can verify the stack overflow theory?
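One way to test the theory directly (a sketch, not code from the library): run the exact same serde_json call on worker threads with explicitly chosen stack sizes. If it aborts with "thread ... has overflowed its stack" on a small stack but completes on a generous one, stack usage is the culprit.

use frankenstein::{MethodResponse, Update}; // import path assumed; adjust to where the crate exports these
use std::thread;

// Parse the captured getUpdates payload on a thread with the given stack size.
// `response_string` stands in for the JSON response that crashes the example.
fn parse_with_stack(stack_bytes: usize, response_string: &'static str) -> bool {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || {
            serde_json::from_str::<MethodResponse<Vec<Update>>>(response_string).is_ok()
        })
        .expect("failed to spawn worker thread")
        .join()
        .expect("worker thread panicked")
}

// e.g. compare parse_with_stack(256 * 1024, RESPONSE) with parse_with_stack(32 * 1024 * 1024, RESPONSE)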
That's interesting!
Boxing everything seems like horrible code to me. And the storage is not gone, it's just elsewhere (on the heap), which only shifts the problem. Designing the structs differently to improve their memory usage (wherever it is stored) is the best solution to me. This should solve this problem and result in more effective RAM usage on all platforms.
Using an enum for Update seems like an interesting solution here. Due to it being used a lot, changes here should have the most impact overall, even if that requires custom (de)serialization code. Taking a look at teloxide, they did exactly that: https://github.com/teloxide/teloxide-core/blob/master/src/types/update.rs (their code is MIT licensed).
Not sure if we could just use the teloxide-core crate for the types in general :thinking:
Even if that requires custom (de)serialization code.
I think we might get away without custom de/serialization, using externally tagged enums and struct flattening. That way, we might just do the following change (field renaming omitted for brevity):
#[derive(Serialize, Deserialize)]
struct Update {
    update_id: u32,
    #[serde(flatten)]
    content: UpdateContent,
}

#[derive(Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
enum UpdateContent {
    Message(Message),
    EditedMessage(Message),
    Poll(Poll),
    // ...
}
If I'm not mistaken, this should result in the following JSON:

{
  "update_id": 32,
  "message": {
    "message_id": 32,
    ...
  }
}

Which is the JSON we currently have, isn't it?
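A quick self-contained check of that assumption; the types below are heavily simplified stand-ins (only a message_id and a poll id), not the real frankenstein structs:

use serde::{Deserialize, Serialize};

// Simplified stand-in types, just enough to exercise the flattened enum.
#[derive(Debug, Serialize, Deserialize)]
struct Message {
    message_id: u64,
}

#[derive(Debug, Serialize, Deserialize)]
struct Poll {
    id: String,
}

#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
enum UpdateContent {
    Message(Message),
    EditedMessage(Message),
    Poll(Poll),
}

#[derive(Debug, Serialize, Deserialize)]
struct Update {
    update_id: u32,
    #[serde(flatten)]
    content: UpdateContent,
}

#[test]
fn flattened_enum_accepts_current_json() {
    let json = r#"{ "update_id": 32, "message": { "message_id": 32 } }"#;
    let update: Update = serde_json::from_str(json).expect("should deserialize");
    assert_eq!(update.update_id, 32);
    assert!(matches!(update.content, UpdateContent::Message(_)));
}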
Again (as I'm unsure how to interpret the official Telegram documentation), this will only work if all of these states never occur at the same time; otherwise an enum obviously won't make sense.
Again (as I'm unsure how to interpret the official Telegram documentation), this will only work if all of these states never occur at the same time; otherwise an enum obviously won't make sense.
The official documentation states exactly that: https://core.telegram.org/bots/api#update
Haha, I didn't find that one. So, any objections to the enum approach? @ayrat555
For memory representation this would be optimal, since the enum will only be about as large as its largest variant, which is probably the Message type, which would probably also be refactored to contain an enum. So, in the end, memory usage should go down by a lot.
no objections.
@johannesvollmer great investigation skills. good work!
Not sure if we could just use the teloxide-core crate for the types in general 🤔
@EdJoPaTo I'd rather keep teloxide and Frankenstein independent.
@EdJoPaTo I'd rather keep teloxide and Frankenstein independent.
Their types look useful. Using the Bot API types in two projects would simplify a lot of work, and they will be the same for every Telegram library. But yeah, teloxide-core seems to do more than just the types.
Also, I noticed that the allowed_updates functionality could be made type-safe. Right now, one might pass an invalid string to the function. Look at teloxide for inspiration: https://github.com/teloxide/teloxide-core/blob/master/src/types/allowed_update.rs
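A rough sketch of what that could look like (the variant list follows the update types in the Bot API documentation; the type name and derives are assumptions, not frankenstein's actual API). The setter for allowed_updates could then take a slice of AllowedUpdate instead of raw strings:

use serde::{Deserialize, Serialize};

// Type-safe replacement for the stringly typed allowed_updates values.
// Serialized as snake_case so it matches the wire format, e.g. "edited_message".
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
enum AllowedUpdate {
    Message,
    EditedMessage,
    ChannelPost,
    EditedChannelPost,
    InlineQuery,
    ChosenInlineResult,
    CallbackQuery,
    ShippingQuery,
    PreCheckoutQuery,
    Poll,
    PollAnswer,
    MyChatMember,
    ChatMember,
    ChatJoinRequest,
}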
Go ahead and create some pull requests. I will happily review them :)
In the Message documentation, I can't find any statement as to whether some of the fields may be present at the same time. Nevertheless, teloxide has an enum for some of the fields.
There's probably no guarantee that two of those fields won't be present at the same time in the future. Do we want to take the risk?
I think as a first step we can just migrate Update to an enum.
By the way, @johannesvollmer, are you working on this?
I can try to pick this up this week (most probably on the weekend) if you're not.
I had the change done this morning; it actually was a task of only five minutes, but I feel like some more tests should be added, though.
see #62
Merged. Awesome, thanks!
Hi! I tried running the example async_reply_to_message_updates.rs. It crashed with a stack overflow error. I was able to pinpoint the overflowing call in the example:
https://github.com/ayrat555/frankenstein/blob/1217d29c3b6c8f9866d0c21cbba42517e62ce52e/examples/async_reply_to_message_updates.rs#L18
Running frankenstein = { version = "0.13.0", features = ["async-http-client"] }, rustc 1.60.0 (7737e0b5c 2022-04-04), Win 10. I will investigate further. Maybe the error is within the example, or in my project setup.