vgmtrans / vgmtrans

VGMTrans - a tool to convert proprietary, sequenced videogame music to industry-standard formats
812 stars 78 forks source link

Refactor based on sykhro's old refactor branch #469

Closed mikelow closed 1 month ago

mikelow commented 1 month ago

This is a refactoring based on portions of sykhro's refactor branch. The branch is 4 years stale, so it took some work to integrate it into the current codebase. This PR does not incorporate all of the changes from refactor, though there are surely other valuable changes to bring over.

The major changes:

If you're reviewing on github, probably easiest to look at this commit which includes almost all changes sans the mio lib files.

Addresses #357

This branch includes spdlog as a git submodule. Discussion on this below.

Discussion

With some exceptions I did not thoroughly go through the refactor changes to the files in the format directory. I did get all of the Scanners working, and updated all of the logging code to use the new macros. In general, I tried not to make any significant changes outside the work in refactor. There are a couple exceptions, though:

spdlog

I brought this in based on this suggestion here. I retained the LogItem interface for communicating logs via Root. In hindsight, I'm barely making using of this library, and I wonder if we should remove it and just use fmt.

git submodules

I included spdlog as a git submodule. I'm not sold on the idea that it should be included this way. I'm also not sold on the inclusion of spdlog at all. The nice thing is that the git submodule hasn't mucked up the git repo and its history with over a megabyte of files, and we can easily remove it and add a vendored version if we choose.

One of the future projects I'm slowly working on is replacing BASS with a JUCE based virtual instrument, and I am convinced the most practical way to include that project will be as a git submodule, since development would largely be done in tandem. I'm open to other ideas. I don't like the git submodule interface, but after research, my sense is that git submodules is the least worst option. Although, I haven't looked at C++ package managers like Conan.

VGMExport

I've diverged from the refactor branch a bit here. Instead of defining the conversion::SaveAsX() methods within the cpp files of the applicable class (like VGMInstrSet or VGMSampColl), I've centralized them into VGMExport.cpp. It all still seems a bit messy, especially given the statefulness of VGMSeq's MIDI conversion which requires it be done as a class method.

pch.h

I noticed you removed many (perhaps all?) inclusions of this header. I went the whole nine yards and removed it entirely, figuring this is an artifact from the WTL days. Hopefully this wasn't a mistake. My understanding is that if we want to use precompiled headers, we can specify them via cmake; they do not need to be explicitly included in each file.

Motivation and Context

Two of the most glaring problems with the codebase are the clunky RawFile interface, particularly the abominable DataSeg class it depends on, and old C++ string handling. This largely solves both of those problems, in addition to others. Hopefully this will also open the door to thread-safe file i/o that will enable background threaded loading.

How Has This Been Tested?

Types of changes

Checklist:

sykhro commented 1 month ago

This is a refactoring based on portions of sykhro's refactor branch. The branch is 4 years stale, so it took some work to integrate it into the current codebase. This PR does not incorporate all of the changes from refactor, though there are surely other valuable changes to bring over.

Thank you for the effort! I hope it wasn't painful

  • targeting C++20

Beware that this changes the minimum version for developing on macOS.

  • VGMFiles are stored in Root as a vector of std::variant<VGMSeq, VGMInstrSet, VGMSampColl, VGMMiscFile> instead of VGMFile*. I've declared this variant type VGMFileVariant. This obviates the need for a file type enum and grants more type safety.

This is definitely an improvement and we could apply it to other parts of the codebase as well. std::variant is a bit clunky so I might propose type-erased solutions in the future (e.g. midi events without subclassing, etc)

Discussion

With some exceptions I did not thoroughly go through the refactor changes to the files in the format directory. I did get all of the Scanners working, and updated all of the logging code to use the new macros. In general, I tried not to make any significant changes outside the work in refactor. There are a couple exceptions, though:

There might be hacky support for GBA collections! I don't remember if it was in that branch.

spdlog

I brought this in based on this suggestion here. I retained the LogItem interface for communicating logs via Root. In hindsight, I'm barely making using of this library, and I wonder if we should remove it and just use fmt.

spdlog is battle-tested, type-safe, thread-safe. It can also support async loggers which would be very handy with our future parallelism plan. I would avoid to roll our own, given its feature set.

git submodules

I included spdlog as a git submodule.

Ah, the pain. I don't see us updating it often so it's probably okay.

One of the future projects I'm slowly working on is replacing BASS with a JUCE based virtual instrument, and I am convinced the most practical way to include that project will be as a git submodule, since development would largely be done in tandem.

While it sounds good in theory, co-dependent repos via git submodules is hell. You constantly end up with weird state, accidentally commit inside of the wrong folder, have people struggle with updating them (if I update it, you have to invoke git submodule update ... manually after you pull). It also makes git an integral part of the build system, effectively creating a "hole". If development is really happening in tandem, then a monorepo isn't a bad idea. Many large-scale projects have migrated to this approach/use version control systems like Perforce that do this seamlessly.

Although, I haven't looked at C++ package managers like Conan.

Conan is part of my daily practice. It's great if you are consuming many libraries in different configs and if you are providing a library for others to use. However, it does require a bit of setup and proper versioning. And python as a mandatory dependency.

VGMExport

I've diverged from the refactor branch a bit here. Instead of defining the conversion::SaveAsX() methods within the cpp files of the applicable class (like VGMInstrSet or VGMSampColl), I've centralized them into VGMExport.cpp. It all still seems a bit messy, especially given the statefulness of VGMSeq's MIDI conversion which requires it be done as a class method.

This was my first step in the direction of type erasure for saving files. The idea was to leverage free functions and templates to have a "VGMUIFileU" interface that provides:

pch.h

I noticed you removed many (perhaps all?) inclusions of this header. I went the whole nine yards and removed it entirely, figuring this is an artifact from the WTL days. Hopefully this wasn't a mistake. My understanding is that if we want to use precompiled headers, we can specify them via cmake; they do not need to be explicitly included in each file.

Thank you for this. One of my frustrations with the codebase is that headers got a bit out of hand and changes in one often trigger the rebuild of almost everything. Eliminating pch.h, which had a ton of stuff in it, was a step in fixing it. I did some more header cleaning in the past with some tools I made, but I don't remember if it was on this branch.

Anyway, consider it approved. As silly as it is for me to approve my own changes :p

mikelow commented 1 month ago

Beware that this changes the minimum version for developing on macOS.

That's ok with me. Onward and upward.

This is definitely an improvement and we could apply it to other parts of the codebase as well. std::variant is a bit clunky so I might propose type-erased solutions in the future (e.g. midi events without subclassing, etc)

Agreed that it's clunky. Do you know of any examples to follow for type-erasure? I see articles discussing it, but it would be nice to see a real-world example of a (relatively) clean implementation.

There might be hacky support for GBA collections! I don't remember if it was in that branch.

It is in refactor, and I took note of it! Excited to try it out. I figure this can be a separate PR.

spdlog is battle-tested, type-safe, thread-safe. It can also support async loggers which would be very handy with our future parallelism plan. I would avoid to roll our own, given its feature set.

Of this I have no doubt. It just seems that the LogManager class as-written is barely utilizing spdlog and that it mostly serves as a wrapper for fmt. Perhaps I'm missing something. I wondered if you had a different concept in mind for what LogManager should look like.

Ah, the pain. I don't see us updating it often so it's probably okay.

I'm not wholly opposed to vendoring. Git submodule seemed a non-committal solution, since it can be easily reversed with minimal affect on git history.

While it sounds good in theory, co-dependent repos via git submodules is hell. You constantly end up with weird state, accidentally commit inside of the wrong folder, have people struggle with updating them (if I update it, you have to invoke git submodule update ... manually after you pull). It also makes git an integral part of the build system, effectively creating a "hole". If development is really happening in tandem, then a monorepo isn't a bad idea. Many large-scale projects have migrated to this approach/use version control systems like Perforce that do this seamlessly.

Yeah, this is a headache of a problem. I really want the virtual instrument project to be its own standalone thing - a sibling repo of vgmtrans. A monorepo therefore seems like a non-starter. I guess I need to think on it further.

Is using a package manager like Conan all that different from git submodules? In my experience outside of C++, it was basically the same thing - expect to run a command to update dependencies after every pull.

This was my first step in the direction of type erasure for saving files.

Again, let me know if there's an example you can point to.

Thank you for this. One of my frustrations with the codebase is that headers got a bit out of hand and changes in one often trigger the rebuild of almost everything.

Glad to do it, though I feel like this is still a problem.