Closed chaburkland closed 4 months ago
@chaburkland : very happy to see this refactoring!
One initial comment: I would favor leaving in the existing inline
usage for a number of reasons.
inline
is a suggestion to the compiler that may not always be observed. Thus, over inline
ing seems fine. @chaburkland : one more:
Can we name file_parsing.c
something more specific like delimited_to_arrays.c
? I guess there might be a name conflict, but a more specific name seems appropriate as, after this change, we are likely to create new, separate files for any other file parsing routines we might add.
@flexatone
Thanks for the push to rename. I wasn't sure what to call it, but delimited_to_arrays
makes the most sense.
As for inline
, I went back through and discovered I marked a lot more functions inline
than previously.
I wasn't able to mark these 4 functions inline
(they were previously), because they are defined in utilites.h
:
AK_build_pair_ssize_t
AK_build_slice
AK_slice_to_ascending_slice
AK_nonzero_1d
I think (at least an AI told me) if we put the actual implementation (not just the header) in the .h
file, we can inline some of those remaining functions, at least the smaller ones more natural for inlining.
@chaburkland : any thoughts on the above (putting implementations in header files for inlining)?
Also, do you think we should merge this before merging https://github.com/static-frame/arraykit/pull/160, or vice versa?
A first pass at splitting up
_arraykit.c
into multiple source files.I split it up as such:
utilities.c
: defines internal arraykit functions/macros that are used by other functions across multiple modulesmethods.c
: defines the public arraykit functionsblock_index.c
: defines allBlockIndex
related codearray_go.c
: defines allArrayGO
related codefile_parsing.c
: defines all code related to delimited file parsingtri_map.c
: defines allTriMap
related code_arraykit
: defines thearraykit
public interfaceAll source files, except
_arraykit.c
have related header files.A main note Many functions are no longer
static
orstatic inline
. Through splitting this code up, I gained a deeper understanding of what those keywords mean.static
means the function defined is specific to the scope of the source file it lives in. As such, it is non-sensical to define astatic
function in a header file, because it is essentially defining an interface that cannot ever be referenced outside the header file itself. As such, the public functions/typedefs in each of the header files are not static, and their corresponding definitions in the source files are also not static.inline
is a way to speed up performance at the cost of binary size. With inlining all possible functions, the binary sizes (on my machine) increases from:841,184 KBs
->1,041,664 KBs
I'm testing the performance diff now