Open ATrivialAtomic opened 3 months ago
Hi, I would have some remarks to the pull request.
Most importantly, the pull request is a bit huge compared to functional changes - it almost rewrite both files (2/3 of code is touched), even the unrelated parts of code. Also the commits can be easily squashed into one, because there is first big commit and then just smaller changes of this one.
As there are together functional changes, refactoring and code movement, it is really hard to check what has been exactly changed, so it almost cannot be audited for regressions etc. (I personally usually separate even refactoring and functional commits.) Would it be possible to reduce the extent? You can look (or use) eg. this commit - it modifies your changes in a way that it at least leaves the code in place and reverts few unrelated changes. The advantage should be obvious if you look at git diff, that your changes are more readable and can be reviewed.
For me will be acceptable if you'd squash all to a single commit (including the suggested changes). If you'd then like to do some further changes, I'd perhaps prefer separate pull requests. I'll sum up my opinions about the non-essential changes:
Last, I see problematic the conditional compilation with HAVE_CUDA - in the original code the only CUDA depending code is the CUDA host allocator. Actually the proposed pull request requires CUDA toolkit for the CUDA codec variant to be used. That was actually not required before - without CUDA, the codec worked (just the buffers was not in the DMA transfer capable region). I don't know exactly our uses' use cases, but I can imagine that someone compiles UG without having the CUDA toolkit - it may not introduce much performance penalty. But don't worry about this particular case, I can fix this after the merge by myself.
Hi @MartinPulec!
Thanks for taking the time to review this pull request. This is my first one, so my apologies if it's a bit unconventional. I'll be sure to keep your notes in mind, regarding separating functional and refactoring changes, as well as squashing minor changes into one commit, for the next PR.
I read through your commit and agree, it's significantly more readable and rolls back some of the non-essential changes. I'm happy to use that as my starting point to re-review, test, and submit if you'd prefer I do that.
I had one question that I wanted your input on since I was going back-and-forth with myself about it. Do you agree with :platform=<cpu,cuda>
or should it be named something else? I'm not tied to the platform keyword, but it's what made sense to me when I was testing out how best to name and implement.
Regarding the HAVE_CUDA
conditional, thanks for catching that. Looking back, all of my testing on Linux used systems with the CUDA toolkit, so I missed that part of my testing. If you'd prefer to fix that after the merge, I can leave all those conditionals as-is.
In terms of the non-essential changes, here's some of the reasoning with why I made those changes. I'm happy to submit separate pull requests if you think any of them really make much of a difference -- don't want to waste your time submitting non-important PRs.
find_if
- No specific reason other than I thought easier to read than the range-based for loop when combined with a lambda. Agreed, not particularly important!log_msg
- I used libavcodec.cpp in compress and decompress as my reference since it was the most recently updated at the time of me working on this PR. I noticed that log_msg
was used more than MSG()
, so I standardized on that. I'm not particularly tied to this change, just wanted to try and use something that seemed a bit more recent and standardized. Using %s for the module prefix is nice because it reduced chances I'd mistype and put [CMPTO J2K enc] instead of [CMPTO J2K dec], but I can definitely see benefits to enforcing the prefix usage. using std::*
- Other than the namespace std, there was no reason I removed these other than it being a matter of opinion. Not particular important either!It is totally fine, I believe that your code is great, but the only problem is really that I want to leave the things traceable, for which it is good to separate functional and non-functional changes, I believe. For the other changes it depends - some are ones that I'd agree, the others I would as well, but I would change it just when changing that code anyways.
I had one question that I wanted your input on since I was going back-and-forth with myself about it. Do you agree with :platform=<cpu,cuda> or should it be named something else? I'm not tied to the platform keyword, but it's what made sense to me when I was testing out how best to name and implement.
I believe that the keyword platform is totally fine. I've just looked into the headers and at Comprimato, they denote this as "technology" (CMPTOTECHNOLOGY{CPU,CUDA}. But both words sound more or less equivalent in this content.
Regarding the HAVE_CUDA conditional, thanks for catching that. Looking back, all of my testing on Linux used systems with the CUDA toolkit, so I missed that part of my testing. If you'd prefer to fix that after the merge, I can leave all those conditionals as-is.
I have no problem doing it by myself but I don't insist on that. Anyways, if you decide to do it by yourself, you can look at the above code comment.
find_if
Sure - I've no particular objections against that. If you wish, I have no problem but it will be nicer if it was in a separate commit. Anyways, just in a case, it doesn't need the std::array
necessarily, std::begin
/std::end
to with the C-array as the argument would work as well.
log_msg
- I used libavcodec.cpp in compress and decompress as my reference since it was the most recently updated at the time of me working on this PR. I noticed that log_msg was used more than MSG(), so I standardized on that. I'm not particularly tied to this change, just wanted to try and use something that seemed a bit more recent and standardized. Using %s for the module prefix is nice because it reduced chances I'd mistype and put [CMPTO J2K enc] instead of [CMPTO J2K dec], but I can definitely see benefits to enforcing the prefix usage.
Ok, well, no serious problem with this, indeed. But just a small explanation - I've actually started using MSG()
later on - it is actually a macro and if you look at its definition, it actually enforces the module name implicitly. On the other hand, it also forces the MOD_NAME to be the C string literal. fixed
I understand that using the macro(s) is slightly controversial in C++ (even these MOD_NAME, which clang-tidy wants to be replaced with a constexpr var), so I don't insist on that, either.
- using std::* - Other than the namespace std, there was no reason I removed these other than it being a matter of opinion. Not particular important either!
Sure, I've consulted it with a colleague on Friday and he also advocated the use of the identifiers including the std namespace. We may consult it later to unify the style but it just haven't been so far (UltraGrid is written in C historically and the C++ coding standards are not much defined).
To sum up the points 1-3, as there is no rule defined, feel free to choose. I also understand that you may want to make this consistent across the file. Provided that it will be in a separate commit, we'd accept it.
Hi @MartinPulec!
I've made the requested changes, if you'd like to review them and provide feedback. Here's a summary of what has been done.
bool supports_cmpto_technology(int)
and replaced #ifdef HAVE_CUDA
directives in all cases except for cmpto_j2k_enc_cuda_host_buffer_data_allocator
MSG()
instead of log_msg()
for message reporting in most cases.using allocator =
in video_compress file to using cuda_allocator =
to match using cpu_allocator =
naming and to be explicit about what type of allocator is being used when std::unique_ptr<video_frame_pool>
is initialized.One thing I noticed during the MSG()
refactor is that fmt is using "%s", which means that MSG()
calls will require a space at the head of the fmt string to print a space between MOD_NAME
and the rest of the arguments. Should I add space to the head of all the MSG()
calls, or is there a chance "%s" will be changed to "%s " in the MSG()
call?
MSG(INFO, "no space at head")
= [Cmpto J2K enc.]no space at head
MSG(INFO, " space at head")
= [Cmpto J2K enc.] space at head
Thanks!
This pull request implements CPU compression and decompression for the Comprimato J2K Codec.
In the interest of leaving the current implementation unchanged for users already using Comprimato with CUDA, CUDA is still the default platform selected for compression and decompression when CUDA libraries are detected during UltraGrid compilation. This means that
-c cmpto_j2k
will default to using CUDA for compression. A system decompressing a cmpto_j2k stream will default to using CUDA for decompression. This has been guaranteed by the use of#ifdef HAVE_CUDA
directives.If UltraGrid is compiled without CUDA,
-c cmpto_j2k
will default to using CPU for compression. A system compiled without CUDA will also default to CPU for decompressing a cmpto_j2k stream.To allow for the selection of either CUDA or CPU, a new :platform= option has been included when CUDA libraries are detected and UltraGrid is compiled. If no CUDA libraries are detected at compile time (ex. MacOS), the :platform= option will be removed and CPU will be the assumed default with no option for the user to change.
Platform default (CUDA or CPU) is determined at compile time using
#ifdef HAVE_CUDA
directives in thestate_video_compress_j2k
andstate_video_decompress_j2k
structs.Additions to Compression Options and Decompression Parameters
To support CPU usage for compression and decompression, additional options and parameters have been added.
J2K Compression Options Syntax | CUDA
J2K Compression Options Syntax | CPU
New J2K Compression Options:
:platform=<cuda,cpu>
-c cmpto_j2k
will imply-c cmpto_j2k:platform=cuda
. Calling-c cmpto_j2k:platform=cpu
will explicitly select CPU compression over CUDA.:thread_count=
bool initialize_j2k_enc_ctx()
:img_limit=
bool initialize_j2k_enc_ctx()
:lossless
New J2K Decompression params:
j2k-dec-cpu-thread-count=
bool initialize_j2k_dec_ctx()
.j2k-dec-img-limit=
bool initialize_j2k_dec_ctx()
and will set img-limit = cpu-thread-count if exceeded.bool initialize_j2k_dec_ctx()
.j2k-dec-use-cpu
j2k-dec-use-cuda
-c cmpto_j2k:help
print out when CUDA is Present-c cmpto_j2k:help
print out when only CPU is Present--param help
when CUDA is Present--param help
when only CPU is PresentChanges to
state_video_compress_j2k
andstate_video_decompress_j2k
construction and struct membersIn addition to adding CPU support for compression and decompression, I've made changes to the way
state_video)compress_j2k
andstate_video_decompress_j2k
classes are constructed, throwing exceptions if they are unable to be initialized. This has removed the need forgoto
statements and explicitdelete
of the state_j2ks by thej2k_compress_init()
andj2k_decompress_init()
functions.Parsing of compression options have been offloaded to private member,
void parse_fmt(const char*)
instruct state_video_compress_j2k
.Parsing of decompression parameters has been offloaded to private member
void parse_params()
instruct state_video_decompress_j2k
.If help is called during
-c cmpto_j2k:help
, theHelpRequested()
exception will throw and be caught byj2k_compress_init()
, resulting instatic_cast<module*>(INIT_NOERR)
Once parsing has completed, private members
initialize_j2k_enc_ctx()
andinitialize_j2k_dec_ctx()
are called and attempt to complete the final initialization of the J2K context.Any errors during this process will result in a thrown exception that will be caught by
j2k_compress_init()
andj2k_decompress_init()
, resulting in aNULL
return.Compress Exceptions Include:
HelpRequested()
InvalidArgument()
UnableToCreateJ2KEncoderCTX()
Decompress Exceptions Include:
UnableToCreateJ2KDecoderCTX()
To account for differences in the video_frame_pool initialization when using CPU and CUDA platforms during cmpto_j2k compression, the
video_frame_pool pool
has been changed tostd::unique_ptr<video_frame_pool> pool
to allow for reconfiguration during state_video_compress_j2k construction.Changes to
j2k_compress_init
andj2k_decompress_init
Since the j2k_compress and j2k_decompress constructors will throw if there is an error with any of the options or parameters, _init functions are responsible for catching those errors and returning what is needed by the module.
New j2k_compress_init implementation
New j2k_decompress_init implementation
Additions of enums and structs to allow easy selection of platform to use and future expansion
enum j2k_compress_platform
andenum j2k_decompress_platform
have been created to allow easy selection of the platform to use. These options currently include:NONE
CPU
CUDA
struct j2k_compress_platform_info_t
has been created to hold a friendly name (ex. "cpu") and its corresponding j2k_compress_platform type.The function
j2k_compress_platform get_platform_from_name(std::string)
has also been added with the purpose of searchingconstexpr auto compress_platforms = array{};
for friendly names and returning an associated platform.These enums and struct are built with the consideration that OpenCL, since it's also supported by Comprimato, can be implemented in the future.
General Changes
struct Codec
and codec information for both compress and decompress have been changed to constexpr arrays.constexpr auto codecs = std::array{ Codec{UYVY, CMPTO_422_U8_P1020, nullptr}, Codec{v210, CMPTO_422_U10_V210, nullptr}, Codec{RGB, CMPTO_444_U8_P012, nullptr}, Codec{BGR, CMPTO_444_U8_P210, nullptr}, Codec{RGBA, CMPTO_444_U8_P012Z, nullptr}, Codec{R10k, CMPTO_444_U10U10U10_MSB32BE_P210, nullptr}, Codec{R12L, CMPTO_444_U12_MSB16LE_P012, rg48_to_r12l}, };
Standardized on using
log_msg()
for message reporting, rather than MSG()Removed some C-style casts and replaced with
static_cast<T>
Made some
#define
statementsconstexpr
Removed some
using
directivesRefactored
void print_dropped
to be platform-aware in reporting hintsMoved #include, #define, constexpr, structs, functions, predeclarations around.
struct Opts
created instead of using anonymous struct and split into General, CUDA, and CPU options. Availability of these options are determined with `#ifdef HAVE_CUDA# directives.Testing
Testing of this implementation has been done on: