Sometimes Assertion fails with `_add(B, 0)`, but not with `_force_add(B, 0)`

edlongman commented 5 months ago

A dsPIC33 MCU (16-bit) is being used for data collection.

I add a value on "Sample ID" to a buffer format containing real time measurements. The sample ID is 16bits, so wraps after 2^16 samples,

The schema is as so

enum DevStatus:ubyte {None = 0,  OK = 2} 

table Sample{
    status:SampleStatus;
    id:ushort;
    digital:ushort;
    channels:[short];
}
table SamplesPacket {
    status: DevStatus;
    samples: [Sample];
}

The Code takes the data from a rolling buffer of samples, assembling it into packets and transmitting it.

    ns(SamplesPacket_start_as_root_with_size(B));

    //Index to read sample data from
    i_sample = PACKET_GetFirstInPacket();

    //Start samples vector and add up to Packet_sample_per_packet samples
    ns(SamplesPacket_samples_start(B));
    for(uint16_t i=0; i<PACKET_sample_per_packet; i++){
        sample_t* sample = &(committed_samples[i_sample]);
        // Write sample data into flatbuffer
        ns(Sample_vec_push_start(B));
        ns(Sample_id_add(B, PACKET_GetSampleCount(i_sample)));
        ns(Sample_digital_add(B, sample->digital.DigitalChannels));
        ns(Sample_channels_start(B));
        for(unsigned int j=0; j<SAMPLE_channel_count; j++){
            ns(Sample_channels_push(B, &(sample->channel[j])));
        }
        ns(Sample_channels_end(B));
        // Update read index
        i_sample = PACKET_AddOverflow(i_sample, 1, PACKET_buffer_l);
        ns(Sample_vec_push_end(B));
    }
    ns(SamplesPacket_samples_end(B));
    ns(SamplesPacket_end_as_root(B));

    // Write samples buffer for transmitting
    void* buf;
    size_t size;
    buf = flatcc_builder_finalize_buffer(B, &size);

During finalize_buffer, if the one of the samples has a sample_id == 0;, an assertion failure will sometimes happen. I'm not able to pinpoint at what point in the finalise it happens as I don't know where to look to find it quickly.

Doing ns(Sample_id_force_add(....) appears to solve the problem but I'm not satisfied that something else isn't going on.

Why is the possibility of it being zero causing a assertion failure (and program crash!)?

Why doesn't the same happen with digital_add(...) when the values are also zero, should I used force add there too?

How do I pinpoint the issue?

mikkelfj commented 5 months ago

Without reading your question carefully, I will provide quick guidance. If that does not solve your issue, please let me know.

The id field likely conflicts with internal names. Notably tables have a few functions generated for the table itself that does not refer to any field, but these functions live in the same namespace as field accessors and can potentially confict.

As I recall 'id' is such a field.

By adding a _get to field accessors, this problem is avoided. These are already generated, but the same function without _get suffix is also, and likely the cause of the problem. You can remove field accessors without _get suffix using the flatcc -g flag when generating code. You can also use a different field name like 'serial' or 'ident' instead of 'id'.

I believe the documentation touches on this, as this was the reason the -g flag was added earlier on.

Please let me know how it works out.

edlongman commented 5 months ago

Thank you, I'll try changing that next week, although the id internals field appears to be lower case and mine is Id so should not clash.

I think it's more likely it's something to do with the 16 bit system.

mikkelfj commented 5 months ago

OK, it might be down to how sequence the calls to the flatcc builder. The builder maintains a stack of objects such as tables and vectors, and pops when current top is completed. If you start to mix operations, the object type (frame type) might be incorrect, which can trigger an assertions.

As to force_add vs add, it might be that if you add a value that is the default value of a table field, typically 0 for integer fields, then the builder will simply ignore the operation, while force_add will always execute the operation. This could explain that there is a difference, but not necessarily what is actually wrong.

Although I wrote the code, I do not recall all details. I suggest builder.md more closely.

I think you are first starting the buffer with the root table named SamplesPacket.

This means the stack looks like: buffer, SamplesPacket. In this state you can add table fields, or start new objects like a Samples vector like you do. After starting the Samples vector, the stack is: buffer, SamplesPakcket, SamplesVector In this state you need to complete the Samples vector before adding to the SamplesPacket. I think you get that wrong.

Also note that there are at least two ways to populate a field that is a vector. You can use table_fieldname_start, ... _end, or you just build the vector, then add the resulting refernce as a field. The former is a shorthand of the latter.

Either way, you can only operate on methods that match the currently open stack top context.

edlongman commented 5 months ago

I tried renaming id to serial but it still fails in the same way.

Adding my own assertion meant I could catch where the builder was failing. It failed on the allocate memory. I think it was to do with fragmentation of the memory area, meaning there was not enough contiguous space to allocate on my MCU.

Using force_add on all the fields means it requires a consistent amount of memory. I will run some longer tests to confirm this through the week,

This means the stack looks like: buffer, SamplesPacket. In this state you can add table fields, or start new objects like a Samples vector like you do. After starting the Samples vector, the stack is: buffer, SamplesPakcket, SamplesVector In this state you need to complete the Samples vector before adding to the SamplesPacket. I think you get that wrong.

Regarding ending the Samples vector. That's what I thought I was doing with ns(Sample_vec_push_end(B)). Should I be using something else instead to take it down a stack level.

mikkelfj commented 5 months ago

On assertions, I advice you to look at the builder.c file. There are numerous assertions on logical incorrect use, as it tracks the stack frame types, but they likely need to be enabled with a define. If it is incorrect use, you would get an error, but these are not practical to check.

Continued use after incorrect use error could lead to corruption, but correct use should only lead to problems if you run out of memory. There is a host of things you can do with custom allocators either to track issues in more detail, or implement your own runtime allocator if it becomes critical, but it is rather involved. This includes how much memory is allocated by default for different uses. You could profile and adjust.

As for Samples vector, it does not matter if you use it one way or the other as the alternatives are generally a shorthand notation for the same thing. However, there is a choice of reordering the operations so you create a vector before the table that use it, store the reference to it and add it later. This is only way you can do it in most languages. FlatCC offers a stack for convenience. Creating the vector earlier on can avoid a bit of stack allocation since it does not need to remember a half complete table and vector at the same time. This is likely a non-issue.

There is one major difference though: there are operations to create a vector in one complete operation if you are able to provide the source data directly, and if endian conversion is not required (uint8 types or little endian platform). In this case the vector is not first built on the stack, but emitted directly to the circular buffer that contains the output.

mikkelfj commented 5 months ago

I should add that it is hard to tell if you are making a mistake or not, because it is so easy to overlook something even it it seems right. This is why enabling assertions in builder.c is so useful.

dvidelabs / flatcc

Sometimes Assertion fails with `_add(B, 0)`, but not with `_force_add(B, 0)` #272