Open xsdg opened 2 months ago
For general reference, the prototype implementation was created in this sequence of commits, from a few years ago. I did get geeqie to compile and run with these changes.
1) Roughly splits filedata.c
into multiple files that are grouped around general functionalities
2) Starts trying to get this to compile and link.
3) Comments out "static" specifier for functions.
4) Adds the "FileData::" class specifier to all functions, to turn them into class methods
5) Defines all filedata functions as class methods.
6) Adds missing method declarations, and moves some typedefs into typedefs.h
7) Adds missing FileData:: class specifiers for functions that return a pointer
8) Working on getting things to compile, and deciding which things should be static or non-static
9) Creates infrastructure to be able to call class methods as callbacks
10) Working out access patterns in order to get linking working.
11) match up declaration and definition of static class members
12) Creates a new filedata.c
that translates between the original C-style FileData API and the C++ class-style API
13) Making progress on getting things to link.
14) OMG it links!
15) IT RUNS!!!
As far as strategy this time, I think I'm going to try to wrap all of the filedata functions into a single big class to begin with, and then work on splitting things out of that class into separate files and/or classes. The pieces of filedata.c
are so tangled (and depend so much on static
functionality) that it's impossible to split things apart without breaking them until the C++ access control is in place anyway.
So I see these as the main steps:
1) Create the FileData class
1) Move filedata.cc
to filedata/filedata.cc
. Put everything in that file inside of a big FileData class
2) Update wherever the FileData
struct is defined so that it also includes the public methods that are now encapsulated
3) Create a new filedata.cc
which trampolines the C-style calls to the C++-style class method calls.
4) Turn any callback into a static class method to avoid needing to jump through method-versus-function calling hoops.
2) Create an initial unit test
1) Sample unit tests from prototype
2) Examples of unit testing private functions from prototype
3) Start separating out functionality into separate files
1) This should just amount to moving functions at this point. Because we stopped using static functions during phase 1, we no longer have constraints around functions all being defined in the same translation unit (i.e. file)
2) Part of this process should also involve trying to understand how the files relate to each other, and figuring out what would make sense as a private API between each file
4) Start turning files into their own sub-classes
1) Here's an example from the prototype
2) The goal is to further encapsulate the dependencies for each group of functionality, so, for example, every bit of code in FileData no longer depends on UI code.
5) Consider encapsulating globals into some kind of singleton object that can be dependency-injected for unit tests.
1) This would go a long way towards allowing us to reliably validate things like corner-case refcounting behaviors.
2) Beyond that, this would facilitate adding debugging features directly to that object, to make it easier to understand and fix bugs, and to validate those bugfixes.
I was working on milestone 5 and ran into a conundrum. Gonna write it out here in case other folks have ideas (especially @qarkai, since this fundamentally a C++ lifetime management question):
I decided to approach this by creating a FileDataContext
struct (which holds global_file_data_count
, file_data_pool
, and file_data_planned_change_map
). I also created an IFileDataContext
interface (which currently just specifies virtual FileDataContext &context() = 0;
, which derived classes need to implement), and a GlobalFileDataContext
singleton that derives from IFileDataContext
:
struct FileDataContext {
#ifdef DEBUG_FILEDATA
gint global_file_data_count = 0;
#endif
GHashTable *file_data_pool = nullptr;
GHashTable *file_data_planned_change_hash = nullptr;
};
class IFileDataContext {
public:
virtual FileDataContext &context() = 0;
};
class GlobalFileDataContext : public IFileDataContext {
private:
static GlobalFileDataContext get_instance() {
…
public:
FileDataContext &context() override { return context; }
private:
std::mutex instance_lock_;
GlobalFileDataContext *instance_ = nullptr; // GUARDED_BY(instance_lock_);
FileDataContext context;
};
class FileData {
private:
FileData() = delete;
FileDataContext *context = nullptr; // TODO(xsdg): Move to bottom of class.
…
The file_data_planned_change_remove
function
context->file_data_planned_change_hash
, which means that it needs to be a FileData
instance methodfile_data_unref(fd)
, which means that it needs to not be an instance method, so that it can destruct the FileData
when appropriateOr, more broadly, how should we deal with cases where code within FileData
itself might remove the last reference to itself? (This is a possibility anywhere that file_data_unref
is called from within src/filedata/filedata.cc
).
file_data_unref
asynchronously. That has all the challenges of doing things asynchronously (like ordering may no longer be guaranteed), but would obviously solve this particular issuefriend
class of FileData
, which is no longer part of a FileData
instance. That said, I think if a FileData
method is in the call stack, that would still be an issue.Actually, I might be wrong. It might be the case that this Just Works (although it might also be very error-prone). See https://isocpp.org/wiki/faq/freestore-mgmt#delete-this.
Also, note that the guarantees from the FAQ above explicitly specify that they only hold when using delete
after using new
for allocation, which we don't do. But also, as a side experiment, I already have a branch where FileData
uses new
/delete
instead of g_new0
and g_free
, so that requirement wouldn't be too difficult to manage if it's actually needed.
AFAIU FileDataContext
is kind of storage of FileData
. Probably file_data_planned_change_remove()
should be part of FileDataContext
.
Forgot to annotate the PRs, but these PRs were relevant to this issue:
FileDataRef
RAII object mentioned above
Currently, FileData is implemented as
filedata.h
(295 lines) andfiledata.cc
(3,581 lines).filedata.h
declares 102 functions, andfiledata.cc
also includes 57 static functions.The purpose of this bug is to lay out the desired end state, since it will take jumping through some hoops to actually get there from how things are implemented right now, while keeping everything functioning during the refactor.
The ideal end state has the following traits: 1) Fundamentally distinct groups of functionality are split into separate implementation files. An old prototype of this refactor created
changeinfo.cc
,core.cc
,filelist.cc
,filter.cc
,metadata.cc
,sidecar.cc
,sidecar_change_info.cc
, andutil.cc
. That seems like a reasonable place to start. 2) FileData becomes a class, in order to take advantage of C++ class access control features. 3) static functions are generally replaced with protected functions. There may be some functions that still make sense to live in an anonymous namespace, but in general, functions should be unit-testable. 4) A shim layer is created to translate between the historical C-style FileData API (things likefile_data_new_dir(const gchar *path_utf8)
) and a more modern C++-style API (something likeFileData::new_for_dir(const gchar *path_utf8)
5) Globals (likestatic GHashTable *file_data_pool
) should at the very least become protected members (so that they're no longer accessible outside of theFileData
class), and should ideally be encapsulated in some way, so that unit tests don't interfere with each other and to make it easier to make things thread-safe.