Add missing test for encode_utf8

CXWorks commented 10 months ago

Hi,

Thanks for your time & patience to review this PR.

We are researchers focusing on Rust unit tests by LLM. By examine the existing code, we found a unit test can be added to improve the repo's overall unit test coverage(this project is already been well tested). The code region to cover is:

https://github.com/bincode-org/bincode/blob/73258a773595d9c0b8367f8a63968a1e8a528849/src/enc/impls.rs#L328-L335

Thanks again for reviewing.

VictorKoenders commented 10 months ago

IoWriter is only enabled when feature std is enabled, please enable this test accordingly or use one of the &[u8] functions

CXWorks commented 10 months ago

Thanks for your time, I add the std only flag on the unit test to pass the CI pipeline.

VictorKoenders commented 10 months ago

I'm not sure if this PR adds something to our code base. To a reader there is a test called test_encode_utf8 that tests a singular utf8 character.

What does this \u{1F600} character mean?
What does this test do? Why does it check for 4 bytes written?
What was the reason this test was added?

If your goal is to add "random input to make sure bincode doesn't crash" then I think it's better to improve our fuzzing framework, that's what it exists for.

To me it would make a lot more sense to, for example:

Test a list of well-known (or not so well-known) characters that trip up code bases
- make sure they all encode to a well known length
Test a list of garbage utf8 data and see bincode correctly reject it
etc

codecov[bot] commented 10 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (73258a7) 57.30% compared to head (b166e71) 57.32%.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## trunk #683 +/- ## ========================================== + Coverage 57.30% 57.32% +0.01% ========================================== Files 51 51 Lines 4335 4344 +9 ========================================== + Hits 2484 2490 +6 - Misses 1851 1854 +3 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

CXWorks commented 10 months ago

Hi, thanks for your feedback. Sorry I did't find the correct place to in the existing unit test for the target function, I think it's better to merge this test in the following location(I will fix it shortly).

https://github.com/bincode-org/bincode/blob/73258a773595d9c0b8367f8a63968a1e8a528849/tests/basic_types.rs#L36-L39

This PR is trying to add test coverage for the region shown above in the encode_utf8 function. We found the target function is tested by the unit tests but not fully covered. Below are my answers to your questions

I'm not sure if this PR adds something to our code base. To a reader there is a test called test_encode_utf8 that tests a singular utf8 character.

What does this \u{1F600} character mean?

Any utf8 with 4 length can fit to cover the region. This character is generated by machine so it doesn't have special meanings. I can create more examples for this test manually,

What does this test do? Why does it check for 4 bytes written?

I will merge this test data into the existing testing frmework.

What was the reason this test was added?

As mentioned above, the target function is tested but not fully covered, by adding this test, all the regions in the function is tested. BTW, this test is fully auto generated(from detection to the executable test), we just want to help improve the test coverage. If you feel uncomfortable about it :pensive:, feel free to close the PR, thanks for your time & patience.

If your goal is to add "random input to make sure bincode doesn't crash" then I think it's better to improve our fuzzing framework, that's what it exists for.

I inspect the current fuzzing framework, I don't see the char as target to be tested and am happy to add this type into the fuzzing framework, do you want me to do that?

To me it would make a lot more sense to, for example:

Test a list of well-known (or not so well-known) characters that trip up code bases

make sure they all encode to a well known length

Test a list of garbage utf8 data and see bincode correctly reject it

etc

These are good suggections, I can create more examples for this test manually.

bincode-org / bincode

Add missing test for encode_utf8 #683

Codecov Report