rgamble / libcsv

Fast and flexible CSV library written in pure ANSI C that can read and write CSV data.
GNU Lesser General Public License v2.1
181 stars 40 forks source link

Memory leaks #3

Closed bobhairgrove closed 7 years ago

bobhairgrove commented 7 years ago

These memory leaks were reported by running csvcheck, csvcut and csvgrep under valgrind, but some of them seem to be coming from libcsv. However, csvcount passed the leak test, so the problems might be in csvutils.

I am attaching the CSV file I used to run these: my_test.csv.zip

csvcheck:

bob@mylaptop:~/libcsv-master/tests$ valgrind --leak-check=full csvcheck my_test.csv
==27156== Memcheck, a memory error detector
==27156== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==27156== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==27156== Command: csvcheck my_test.csv
==27156== 
my_test.csv well-formed
==27156== 
==27156== HEAP SUMMARY:
==27156==     in use at exit: 128 bytes in 1 blocks
==27156==   total heap usage: 4 allocs, 3 frees, 5,800 bytes allocated
==27156== 
==27156== 128 bytes in 1 blocks are definitely lost in loss record 1 of 1
==27156==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27156==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27156==    by 0x4E3ABA2: csv_increase_buffer (libcsv.c:291)
==27156==    by 0x4E3B745: csv_parse (libcsv.c:325)
==27156==    by 0x40119C: check_file (in /usr/local/bin/csvcheck)
==27156==    by 0x401497: main (in /usr/local/bin/csvcheck)
==27156== 
==27156== LEAK SUMMARY:
==27156==    definitely lost: 128 bytes in 1 blocks
==27156==    indirectly lost: 0 bytes in 0 blocks
==27156==      possibly lost: 0 bytes in 0 blocks
==27156==    still reachable: 0 bytes in 0 blocks
==27156==         suppressed: 0 bytes in 0 blocks
==27156== 
==27156== For counts of detected and suppressed errors, rerun with: -v
==27156== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

csvcut:

bob@mylaptop:~/libcsv-master/tests$ valgrind --leak-check=full csvcut -f 1 my_test.csv
==27168== Memcheck, a memory error detector
==27168== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==27168== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==27168== Command: csvcut -f 1 my_test.csv
==27168== 
"lastname"
"Doe"
"Lamb"
(...snip...)
==27168== 
==27168== HEAP SUMMARY:
==27168==     in use at exit: 162 bytes in 3 blocks
==27168==   total heap usage: 22 allocs, 19 frees, 6,444 bytes allocated
==27168== 
==27168== 2 bytes in 1 blocks are definitely lost in loss record 1 of 3
==27168==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27168==    by 0x4025A6: xmalloc (in /usr/local/bin/csvcut)
==27168==    by 0x4026E9: Strndup (in /usr/local/bin/csvcut)
==27168==    by 0x401A16: field_spec_cb1 (in /usr/local/bin/csvcut)
==27168==    by 0x4E3ADA9: csv_fini (libcsv.c:185)
==27168==    by 0x4015FD: process_field_specs (in /usr/local/bin/csvcut)
==27168==    by 0x4024B4: main (in /usr/local/bin/csvcut)
==27168== 
==27168== 128 bytes in 1 blocks are definitely lost in loss record 3 of 3
==27168==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27168==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27168==    by 0x4E3ABA2: csv_increase_buffer (libcsv.c:291)
==27168==    by 0x4E3B745: csv_parse (libcsv.c:325)
==27168==    by 0x4015D2: process_field_specs (in /usr/local/bin/csvcut)
==27168==    by 0x4024B4: main (in /usr/local/bin/csvcut)
==27168== 
==27168== LEAK SUMMARY:
==27168==    definitely lost: 130 bytes in 2 blocks
==27168==    indirectly lost: 0 bytes in 0 blocks
==27168==      possibly lost: 0 bytes in 0 blocks
==27168==    still reachable: 32 bytes in 1 blocks
==27168==         suppressed: 0 bytes in 0 blocks
==27168== Reachable blocks (those to which a pointer was found) are not shown.
==27168== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==27168== 
==27168== For counts of detected and suppressed errors, rerun with: -v
==27168== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

csvgrep:

bob@mylaptop:~/libcsv-master/tests$ valgrind --leak-check=full csvgrep -f 1 Doe my_test.csv
==27209== Memcheck, a memory error detector
==27209== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==27209== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==27209== Command: csvgrep -f 1 Doe my_test.csv
==27209== 
"Doe","John","SE-1","CS-101/EN-103/PH-102/MA-103","john.doe@gmail.com"
"Doe","John","SE-1","CS-101/EN-103/PH-102/MA-103","john.doe@gmail.com"
(...snip...)
==27209== 
==27209== HEAP SUMMARY:
==27209==     in use at exit: 11,693 bytes in 965 blocks
==27209==   total heap usage: 1,928 allocs, 963 frees, 23,234 bytes allocated
==27209== 
==27209== 2 bytes in 1 blocks are definitely lost in loss record 1 of 20
==27209==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27209==    by 0x402B80: xmalloc (in /usr/local/bin/csvgrep)
==27209==    by 0x402CC3: Strndup (in /usr/local/bin/csvgrep)
==27209==    by 0x401865: field_spec_cb1 (in /usr/local/bin/csvgrep)
==27209==    by 0x4E3ADA9: csv_fini (libcsv.c:185)
==27209==    by 0x401728: process_field_specs (in /usr/local/bin/csvgrep)
==27209==    by 0x40292E: main (in /usr/local/bin/csvgrep)
==27209== 
==27209== 128 bytes in 1 blocks are definitely lost in loss record 15 of 20
==27209==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27209==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27209==    by 0x4E3ABA2: csv_increase_buffer (libcsv.c:291)
==27209==    by 0x4E3B745: csv_parse (libcsv.c:325)
==27209==    by 0x4016FD: process_field_specs (in /usr/local/bin/csvgrep)
==27209==    by 0x40292E: main (in /usr/local/bin/csvgrep)
==27209== 
==27209== 4,203 bytes in 933 blocks are definitely lost in loss record 19 of 20
==27209==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27209==    by 0x402B80: xmalloc (in /usr/local/bin/csvgrep)
==27209==    by 0x402CC3: Strndup (in /usr/local/bin/csvgrep)
==27209==    by 0x401C8A: matches_pattern (in /usr/local/bin/csvgrep)
==27209==    by 0x40228E: cb2 (in /usr/local/bin/csvgrep)
==27209==    by 0x4E3B110: csv_parse (libcsv.c:414)
==27209==    by 0x4024CC: grep_file (in /usr/local/bin/csvgrep)
==27209==    by 0x402AE0: main (in /usr/local/bin/csvgrep)
==27209== 
==27209== LEAK SUMMARY:
==27209==    definitely lost: 4,333 bytes in 935 blocks
==27209==    indirectly lost: 0 bytes in 0 blocks
==27209==      possibly lost: 0 bytes in 0 blocks
==27209==    still reachable: 7,360 bytes in 30 blocks
==27209==         suppressed: 0 bytes in 0 blocks
==27209== Reachable blocks (those to which a pointer was found) are not shown.
==27209== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==27209== 
==27209== For counts of detected and suppressed errors, rerun with: -v
==27209== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
bobhairgrove commented 7 years ago

I just ran csvcheck through a debugger with a breakpoint set on the library function csv_free(). It was never called.

Seems like csv_fini() needs to call it before returning.

bobhairgrove commented 7 years ago

After looking at some of the source code for csvutils, and re-reading the man csv page, it seems that the responsibility for calling csv_free() is left up to the caller. However, I am wondering if it would be better design to have csv_fini() call this? Calling csv_free() multiple times is never a problem because after calling it, the pointer to the freed memory block is set to NULL. However, it is not orthogonal to call csv_init(), process the data, then call csv_fini(), and THEN still have to call csv_free(). Just IMHO.

If you'd rather leave it this way, we can close this issue now. However, it would be nice if you could fix the memory leaks in csvutils.

rgamble commented 7 years ago

The complement to csv_init is csv_free, csv_fini is part of the parsing process and the same parser object can continue to be used to parse new CSV data after calling csv_fini without having to call csv_free or csv_init again.

bobhairgrove commented 7 years ago

OK, I am closing this now that it is clear that the errors from valgrind are coming from csvutils and not from libcsv.