knurling-rs / defmt

Efficient, deferred formatting for logging on embedded systems
https://defmt.ferrous-systems.com/
Apache License 2.0
750 stars 69 forks source link

Bug: cannot print a big slice #825

Open ValouBambou opened 3 months ago

ValouBambou commented 3 months ago

The log of a big Vec is buggy (it doesn't happen for a small one), for example, with this example code the log is not displayed.

#![no_std]
#![no_main]

extern crate alloc;
use alloc::vec;
use core::mem::MaybeUninit;
use embedded_alloc::Heap;
use embassy_executor::Spawner;
use defmt::println;
use {defmt_rtt as _, panic_probe as _};

#[global_allocator]
static HEAP: Heap = Heap::empty();

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    {
        // initialize the HEAP before using it
        const HEAP_SIZE: usize = 1024 * 100; // 100 kiB
        static mut HEAP_MEM: [MaybeUninit<u8>; HEAP_SIZE] = [MaybeUninit::uninit(); HEAP_SIZE];
        unsafe { HEAP.init(HEAP_MEM.as_ptr() as usize, HEAP_SIZE) }
    }
    let _p = embassy_rp::init(Default::default());

    let slice = vec![(i32::MAX, i32::MAX); 100];
    println!("slice = {:?}", slice);
    loop {}
}

The test setup that I used is a Raspberry Pi Pico with a debug probe using probe-rs and embassy. The version and features of the dependencies are these one:

defmt = { version = "0.3", features = ["alloc"] }
defmt-rtt = "0.4"
embedded-alloc = "0.5.1"

and embassy in patched with this commit hash "ad7d4494fad12f98c7e8e2b776bc12453a66be9a".

Urhengulas commented 3 months ago

Thank you for reporting the issue @ValouBambou. The vec format implementation just delegates to the slice implementation. Can you also observe the problem with a big slice?

I do not have the hardware at hand to test it, so I will try to reproduce it next week.

MathiasKoch commented 3 months ago

The default rtt buffer size in defmt is 1024 bytes. If you log more than this, the output is silently discarded. I think this is what you are seeing here.. You can configure the buffer size using the environment variable DEFMT_RTT_BUFFER_SIZE

See more here https://defmt.ferrous-systems.com/setup.html?highlight=Buffer#memory-use

ValouBambou commented 3 months ago

I can confirm that it is reproducible with slice instead of Vec.

Also, it doesn't really just silently discarded stuff (which is not a perfect solution in my mind, maybe truncating will be better behavior) but also alter some of the logs that came before. This is probably due to the fact that the buffer is not sent while it is not filled or something like so depending on the way it is implemented.

For example, adding these prints:

    let slice = &[(i32::MAX, i32::MAX); 100];
    println!("HELLO");
    println!("slice = {:?}", slice);
    println!("BYE");

leads to this output:

BYE

And I don't think this is the desirable behavior, it makes loses of the preceding logs and makes it harder to follow the chronological events when debugging for instance.

Urhengulas commented 3 months ago

@ValouBambou Does increasing the buffer size (as pointed out by @MathiasKoch) solve your problem?

ValouBambou commented 3 months ago

Yes it does. But in a more general way, maybe truncating will be a more desirable behavior. Also, there is a similar catch when trying to print inside a loop, where some logs are silently ignored.

Urhengulas commented 3 months ago

But in a more general way, maybe truncating will be a more desirable behavior. Also, there is a similar catch when trying to print inside a loop, where some logs are silently ignored.

I agree. Don't have capacity to work on it, but I am happy to review a PR :D

ValouBambou commented 3 months ago

I would be really happy to do it, but I may need some help to fully understand the code base and what part should be modified to implement this behavior.

ValouBambou commented 3 months ago

I'm not sure if it should be a modification in the defmt_rtt crate because that's the only place where the size of the buffer and the available size in the buffer is known. Or in the defmt crate because it is where slice are serialized.

Urhengulas commented 3 months ago

I actually do not know as well, I assume it needs to happen in the serialization process. Maybe @Dirbaio could give a hint, as the person who did the last overhaul of the serialization format?

Dirbaio commented 3 months ago

The default rtt buffer size in defmt is 1024 bytes. If you log more than this, the output is silently discarded. I think this is what you are seeing here..

this is not true at all (or shouldn't). what should be happening is:

if this is not what's happening then there's a bug somewhere. Truncating the slice is not a fix, it's a workaround.

ValouBambou commented 1 month ago

I can't observe this correct behavior you described on my setup in some circumstances. The bug is triggered by trying to print too quickly a lot of messages, or with big slices. For example, this simple loop only prints the last iterations and skip the first ones.

for i in 0..1000 {
    println!("{}", i);
}

Produces these logs when using probe-rs run.

922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
ValouBambou commented 1 month ago

And when doing 10x more iterations, the logs are even weirder:

511
512
513
514
515
620
621
622
623
624
831
832
937
938
939
940
941
942
1047
1048
1049
1050
1155
1156
1157
1158
1159
1160
1161
1162
1163
1267
1268
1475
1476
1477
1478
1479
1480
1481
1482
1587
1588
1589
1590
1591
1592
1593
1698
1699
1700
1701
1702
1703
1704
1705
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1924
1925
1926
1927
1928
1929
2033
2034
2035
2036
2037
2038
2039
2040
2041
2145
2146
2147
2148
2149
2254
2255
2256
2257
2258
2259
2260
2261
2262
2366
2367
2368
2369
2370
2371
2372
2373
2683
2684
2685
2686
2687
2791
2792
2793
2794
2795
2796
2797
2798
2902
2903
2904
2905
2906
2907
2908
2909
2910
2911
3015
3016
3017
3018
3019
3124
3125
3126
3127
3128
3129
3130
3131
3234
3235
3236
3237
3238
3239
3334
3335
3336
3337
3338
3339
3340
3435
3436
3437
3438
3439
3440
3721
3722
3723
3724
3725
3820
3821
3822
3823
3824
3825
3826
3827
724724
3923
3924
3925
4020
4021
4022
4023
4024
4025
4026
4121
4122
4123
4124
4125
4126
4127
4128
4223
4224
4225
4320
4321
4322
4323
4324
4325
4326
4421
4422
4423
4424
4425
4426
4427
4428
4523
4524
4525
4526
4527
4528
4623
4624
4625
4626
4627
4628
4629
4724
4725
4726
4727
4728
4729
4824
4825
5107
5108
5109
5110
5205
5206
5207
5208
5209
5210
5305
5306
5307
5308
5309
5310
5311
5406
5407
5408
5409
5410
5505
5506
5601
5602
5603
5604
5605
5700
5701
5702
5703
5704
5705
5706
5707
5708
5803
5804
5805
5806
726703
5902
5903
5904
5905
6000
6001
6002
6003
6004
6005
6006
6007
6102
6103
6104
6105
6106
6201
6202
6203
6204
6205
6206
6207
6208
6209
6304
6305
6306
6307
6308
6403
6404
6405
6406
6407
6408
6691
6692
6787
6788
6789
6790
6791
6792
6887
6888
6889
6890
6891
6892
6893
6894
6989
7178
7179
7180
7368
7463
7464
7465
7466
7467
7562
7563
7564
7565
7566
7567
7662
7663
7664
7665
7666
7667
7668
7763
7764
7765
7860
7861
7862
7863
7864
7865
7866
9920
9921
9922
9923
9924
9925
9926
9927
9928
9929
9930
9931
9932
9933
9934
9935
9936
9937
9938
9939
9940
9941
9942
9943
9944
9945
9946
9947
9948
9949
9950
9951
9952
9953
9954
9955
9956
9957
9958
9959
9960
9961
9962
9963
9964
9965
9966
9967
9968
9969
9970
9971
9972
9973
9974
9975
9976
9977
9978
9979
9980
9981
9982
9983
9984
9985
9986
9987
9988
9989
9990
9991
9992
9993
9994
9995
9996
9997
9998
9999
Urhengulas commented 1 month ago

It might be the case that probe-rs does not put the RTT channel into blocking mode, which is the behaviour probe-run had. That could cause this problem, could it not?

ValouBambou commented 1 month ago

That's weird, from what I see in probe-rs source code (with grep) the only ChannelMode created is BlockIfFull which seems to be the default. I guess that this mode is supposed to have the same meaning as the previous blocking behavior of probe-run.

ValouBambou commented 1 month ago

I updated probe-rs and now got a similar behavior but with different logs. Here, instead of 10_000 lines, there are only 9_394. The missing one are the first 228 lines, which are skipped. And the rest are dropped randomly every 25 iterations in average (worst delta is 14 and best is 83).

Here is a little python script I used to get the numbers from the logs.

with open("tmp.log", "r") as f:
    nums = list(map(int, f.readlines()))
    print(f"{len(nums)} / 10_000 lines expected")
    first = nums[0]
    print(f"missing first {first} lines")
    missed = [miss for a, b in zip(nums, nums[1:]) for miss in range(a + 1, b)]
    print(f"and {len(missed)} lines during all other logs")
    step_miss = [b - a for a, b in zip(missed, missed[1:])]
    avg_miss = sum(step_miss) / len(step_miss)
    print(f"in average 1 miss every {int(avg_miss)} lines")
    print(f"at best 1 miss every {max(step_miss)} lines")
    print(f"at worst 1 miss every {min(step_miss)} lines")
    print(f"missed lines => {missed}")

and the logs used in attached file.

tmp.log