gphoto / libgphoto2

The libgphoto2 camera access and control library.
GNU Lesser General Public License v2.1
1.03k stars 324 forks source link

`gp_{camera_folder,filesystem}_put_file` leak memory on fd & handler input files #837

Open RReverser opened 2 years ago

RReverser commented 2 years ago

gp_file_get_data_and_size docs state:

 * For regular CameraFiles, the pointer to data that is returned is
 * still owned by libgphoto2 and its lifetime is the same as the #file.
 *
 * For filedescriptor or handler based CameraFile types, the returned
 * data pointer is owned by the caller and needs to be free()d to avoid
 * memory leaks.

However, looking at all the usages of gp_file_get_data_and_size, I found that a lot of them - pretty much all of put_file_func implementations, as well as few others - just assume that file is a "regular" in-memory file and don't free the retrieved data if it's actually a GP_FILE_ACCESSTYPE_FD or GP_FILE_ACCESSTYPE_HANDLER kind of file.

Unless I'm missing something, this means that uploading files created from gp_file_new_from_fd or gp_file_new_from_handler via gp_{camera_folder,filesystem}_put_file will always leak memory.

I walked through the list of gp_file_get_data_and_size occurences in the codebase and tried to limit it to those that exhibit this issue (but I might have missed a few):

https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/ax203/library.c#L231 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/canon/serial.c#L964 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/fuji/library.c#L223 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/kodak/dc210/library.c#L64 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/konica/qm150.c#L521 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/minolta/dimagev/upload.c#L54 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/panasonic/dc1000.c#L442 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/panasonic/dc1580.c#L563 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/ptp2/library.c#L8339 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/ptp2/library.c#L9078 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/ricoh/library.c#L275 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/sierra/library.c#L1443 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/sierra/sierra.c#L601 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/sierra/sierra.c#L754 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/sonydscf55/sony.c#L785 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/soundvision/soundvision.c#L400 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/st2205/library.c#L304 https://github.com/gphoto/libgphoto2/blob/cf154a3d81fcd4b6ce12c57f428c2e712607e109/camlibs/tp6801/library.c#L226

To be fair, I think this is an easy mistake to make for downstream consumers as well, since rules around ownership of the data returned from gp_file_get_data_and_size are confusing and depend on how the file was created. To makes matters worse, downstream consumers have no way of knowing how said file was created since CameraFile::accesstype is not exposed via getters, so users can't know who the data belongs to unless they were the ones to create the file too.

Aside from fixing those usages, I think it would be better to deprecate this function and instead provide one that always stores a data pointer internally, and consistently frees it on gp_file_free.

msmeissn commented 2 years ago

I need to go over it when i have some quiet hours. :/ I cannot easily get rid of this old API and its memory ownage pattern is weird as you saw. Usually the library always "owns" the memory.

Sija commented 3 hours ago

@msmeissn Would you have time to revisit this?