JuliaMath / FixedPointNumbers.jl

fixed point types for julia
Other
80 stars 33 forks source link

[Feature request] Add support for reading FixedPointNumbers #251

Open tlnagy opened 3 years ago

tlnagy commented 3 years ago

It would be great if FixedPointNumbers could add support for the read function, i.e. read(io, N4f12), that would handle the padding and other issues. For example, given a 24-bit long stream, the output of the function would be a vector of two N4f12s.

This recently came up in https://github.com/tlnagy/TiffImages.jl/issues/58, but I feel like this package is a better home for such logic. I could make an attempt at an implementation if people feel like this is the best location for it.

kimikage commented 3 years ago

I agree that such a feature would be useful. However, there are a few problems.

First, N4f12 (Normed{UInt16,12}) and Normed{UInt12,12} (which is currently not well supported) are different and need to be distinguished. It is ambiguous whether you want to use Normed{UInt12,12} as-is, extend it to N4f12, extend it to N0f16, or convert it to a completely different type.

Secondly, I don't think it's a good idea to overload read/write for general IO streams. Since reading/writing data that is not byte-aligned is context-full, that can easily give rise to bugs. So, we need to define a specialized IO type or define different functions. However, whether it should be done in this package is debatable.

Perhaps we need to make a policy about #247 first, but I don't have a definite answer for that yet. So, for the time being, I think it is better to implement the feature in downstream packages.

tlnagy commented 3 years ago

With read you provide the concrete type that you cast to, I don't see how inference could be confused.

Btw, what is the UInt12 type? It doesn't seem to be a base Julia type.

kimikage commented 3 years ago

With read you provide the concrete type that you cast to, I don't see how inference could be confused.

However, there is no API to specify the source (raw) type and the start bit. Probably, only "one" N4f12 can be read from a 24-bit long stream.

what is the UInt12 type?

Julia can't handle types with bit widths other than multiples of 8, so #247 includes the problem of how to define them.

If you already have a good solution, then PR is welcome.

johnnychen94 commented 3 years ago

Since N4f12 actually assumes 16-bit io string, read(io, N4f12) will have alignment issues to 24-bit io string, so until #247 is supported, it is not possible to bring unambiguous read(io, N4f12) support IMO. As @kimikage said, you can't just hardcode the reinterpretation of N4f12 as your 12-bit format.

Two stages are involved here: 1) read bit string as Vector{Bool} 2) convert the Vector{Bool} to Vector{N4f12}.

Hence currently it should be supported in downstream TiffImages to provide convert_read(io, T, n_bytes) that jointly read the io string and reinterpret/convert it to T eltype. For 12-bit number, the minimal n_bytes should be 3 (48-bits).

Maybe we should propose this convert_read in upstream Julia Base.

tlnagy commented 3 years ago

Ah, I see the concern. I guess my vision was that, since there are only 12 information bits in N4f12, that read would read in 12 bit chunks (not 16!). Naturally, the struct would then pad the 12 information bits with 4 padding bits for the N4f12 type. This would only work if the length of the input string is divisible by 12 (the number of information bits).

Since N4f12 actually assumes 16-bit io string, read(io, N4f12) will have alignment issues to 24-bit io string, so until #247 is supported, it is not possible to bring unambiguous read(io, N4f12) support IMO.

It assumes a 16-bit backing type, i.e. it's a wrapped UInt16, but what I'm imagining is reading a 24-bit string as three UInt8s, figuring out the bit math to get two 12 bit values from that and then converting to N4f12 which would actually be 32-bit since they are backed by UInt16s.

timholy commented 3 years ago

You might want to create a NfFormat(io) wrapper that would encode all your choices about how you want the IO operation to work. Depending on implementation size it might be better as an add-on package.