The GetData Project is the reference implementation of the Dirfile Standards, a filesystem-based, column-oriented database format for time-ordered binary data.
It appears that SIE-encoded fields cannot be read safely from multiple threads. I first came across this problem through the Python bindings, but I have reproduced it in C using the following code:
#include <getdata.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void* read_data(void* arg) {
    DIRFILE* df = (DIRFILE*)arg;
    double* data = (double*)malloc(1000 * sizeof(double));
    int res;
    for (int i = 0; i < 100; i++)
    {
        res = gd_getdata(df, "test", 0, 0, 1000, 0, GD_FLOAT64, data);
    }
    printf("res = %d\n", res);
    free(data);
    return NULL;
}

int main(int argc, char **argv)
{
    DIRFILE* df = gd_open("data", GD_RDONLY);
    if (df == NULL) {
        fprintf(stderr, "Failed to open dirfile\n");
        return 1;
    }
    pthread_t thread1, thread2;
    pthread_create(&thread1, NULL, read_data, (void*)df);
    pthread_create(&thread2, NULL, read_data, (void*)df);
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);
    gd_close(df);
    return 0;
}
and a dirfile written by the Python bindings with the following format file:
# This is a dirfile format file.
# It was written using version 0.11.0 of the GetData Library.
# Written on Tue Nov 5 15:46:14 2024 UTC by simon.
/VERSION 10
/ENDIAN little
/PROTECT none
/ENCODING sie
test RAW FLOAT64 1
/REFERENCE test
I consistently get segfaults in the fread in sie.c's _GD_Advance (or double-free errors at free(databuffer) in _GD_DoRaw).
These errors do not show up with unencoded fields, so I assume it's something to do with SIE, although I have only had a very cursory look inside.
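A possible stopgap, which I have not verified against the real library, would be to serialize the reads with a mutex so that only one thread is inside the read path at a time. The sketch below shows only the locking pattern: `shared_reader`, `guarded_read` and `run_two_readers` are hypothetical names, and the stand-in reader is not a GetData type. In the actual program the critical section would wrap the gd_getdata call on the shared DIRFILE.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_LEN 1000

/* Hypothetical stand-in for a handle with shared internal read state.
 * This is NOT a GetData type; it only mimics "one scratch buffer
 * shared by every caller". */
typedef struct {
    pthread_mutex_t lock;
    double scratch[SCRATCH_LEN]; /* unsafe to touch from two threads at once */
    long reads_done;
} shared_reader;

/* Hypothetical wrapper: every read goes through the same mutex.
 * Returns the number of samples copied, or -1 if n is too large. */
static int guarded_read(shared_reader *r, double *out, size_t n)
{
    assert(r != NULL && out != NULL);
    if (n > SCRATCH_LEN)
        return -1;

    pthread_mutex_lock(&r->lock);
    /* Critical section: the real code would call gd_getdata() here. */
    for (size_t i = 0; i < n; i++)
        r->scratch[i] = (double)i;
    memcpy(out, r->scratch, n * sizeof(double));
    r->reads_done++;
    pthread_mutex_unlock(&r->lock);

    return (int)n;
}

static void *reader_thread(void *arg)
{
    shared_reader *r = arg;
    double data[SCRATCH_LEN];
    for (int i = 0; i < 100; i++)
        guarded_read(r, data, SCRATCH_LEN);
    return NULL;
}

/* Runs two reader threads against one shared handle and returns the
 * total number of completed reads. */
long run_two_readers(void)
{
    shared_reader r;
    pthread_t t1, t2;

    pthread_mutex_init(&r.lock, NULL);
    r.reads_done = 0;

    pthread_create(&t1, NULL, reader_thread, &r);
    pthread_create(&t2, NULL, reader_thread, &r);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_mutex_destroy(&r.lock);
    return r.reads_done;
}
```

This trades away any parallelism in the reads, so it is only a workaround. Opening a separate DIRFILE per thread might be another option, but I have not tested that either.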