JuliaImages / QRCoders.jl

Creating QR Codes within Julia
https://juliaimages.org/QRCoders.jl/
MIT License
68 stars 11 forks source link

Add support of Kanji Mode #18

Closed RexWzh closed 2 years ago

codecov[bot] commented 2 years ago

Codecov Report

Merging #18 (f02ba74) into master (fddca45) will increase coverage by 2.17%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #18      +/-   ##
==========================================
+ Coverage   97.53%   99.70%   +2.17%     
==========================================
  Files           3        4       +1     
  Lines         324      337      +13     
==========================================
+ Hits          316      336      +20     
+ Misses          8        1       -7     
Impacted Files Coverage Δ
src/QRCoders.jl 100.00% <100.00%> (+1.57%) :arrow_up:
src/encode.jl 100.00% <100.00%> (ø)
src/errorcorrection.jl 98.61% <100.00%> (+3.94%) :arrow_up:
src/matrix.jl 100.00% <100.00%> (+1.63%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

RexWzh commented 2 years ago

Main change

  1. use zero, unit instead of Poly([0]), Poly([1])
  2. move original tests of QRCode.jl into test/tst_overall.jl

For a highly-optimized algorithm where bandwidth is the bottleneck, UInt8 vs Int can make a lot of difference in performance, e.g., https://github.com/JuliaImages/ImageMorphology.jl/pull/102#issuecomment-1165297247. But for QRCoders, it might be too early for this to make a difference.

I did some tests about type allocations:

johnnychen94 commented 2 years ago

Looks like there's a misunderstanding here about the type annotation on performance. I'll write up a few things and share it with you in private later this week.

RexWzh commented 2 years ago

I'm a little confused about the purpose of key word and predefined variable.

For example,

function qrcode( message::AbstractString
       , eclevel::ErrCorrLevel = Medium()
       ; version::Int = 0
       , compact::Bool = false )

Would it be better to write

function qrcode( message::AbstractString
       ; eclevel::ErrCorrLevel = Medium()
       , version::Int = 0 
       , compact::Bool = false)
johnnychen94 commented 2 years ago

Choosing between keyword parameter and positional argument is a tradeoff between convenience and explicitly -- this is more of a design taste.

Because keyword parameter doesn't participate in multiple dispatch, it has to be type stable regardless of different value choices. For instance,

# A toy example

julia> struct M1 end

julia> struct M2 end

julia> f(; m) = m()
f (generic function with 1 method)

julia> g(m) = m()
g (generic function with 1 method)

julia> @code_warntype f(m=M1) # unstable
MethodInstance for (::var"#f##kw")(::NamedTuple{(:m,), Tuple{DataType}}, ::typeof(f))
  from (::var"#f##kw")(::Any, ::typeof(f)) in Main at REPL[3]:1
Arguments
  _::Core.Const(var"#f##kw"())
  @_2::NamedTuple{(:m,), Tuple{DataType}}
  @_3::Core.Const(f)
Locals
  m::DataType
  @_5::DataType
Body::Any
1 ─ %1  = Base.haskey(@_2, :m)::Core.Const(true)
│         Core.typeassert(%1, Core.Bool)
│         (@_5 = Base.getindex(@_2, :m))
└──       goto #3
2 ─       Core.Const(:(Core.UndefKeywordError(:m)))
└──       Core.Const(:(@_5 = Core.throw(%5)))
3 ┄ %7  = @_5::DataType
│         (m = %7)
│   %9  = (:m,)::Core.Const((:m,))
│   %10 = Core.apply_type(Core.NamedTuple, %9)::Core.Const(NamedTuple{(:m,)})
│   %11 = Base.structdiff(@_2, %10)::Core.Const(NamedTuple())
│   %12 = Base.pairs(%11)::Core.Const(Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}())
│   %13 = Base.isempty(%12)::Core.Const(true)
│         Core.typeassert(%13, Core.Bool)
└──       goto #5
4 ─       Core.Const(:(Base.kwerr(@_2, @_3)))
5 ┄ %17 = Main.:(var"#f#3")(m, @_3)::Any
└──       return %17

julia> @code_warntype g(M1) # stable
MethodInstance for g(::Type{M1})
  from g(m) in Main at REPL[4]:1
Arguments
  #self#::Core.Const(g)
  m::Core.Const(M1)
Body::M1
1 ─ %1 = (m)()::Core.Const(M1())
└──      return %1

For our case, if we can ensure that output is always Matrix{Bool}, we're good to use keyword parameter.

We could also provide qrcode(message; error_correction=:medium) with a small type-unstable function:

const _error_correction_map = Dict(:medium=>Medium(), ...)
_normalize_error_correction(s::String) = _normalize_error_correction(Symbol(s))
_normalize_error_correction(s::Symbol) = _error_correction_map[s]
_normalize_error_correction(m::ErrCorrLevel) = m

This is type-unstable, but it only happens at the out-most level so it's an acceptable tradeoff between convenience and performance. (See also the performance tip: kernel function https://docs.julialang.org/en/v1/manual/performance-tips/#kernel-functions)

RexWzh commented 2 years ago

Correct errors in the source code of QRCode.jl

using QRCode ## original repo
## works fine
exportqrcode("000123456", "number.png") # Numeric
exportqrcode("hello world!", "hello.png") # Byte without special unicode
## invalid qrcode
exportqrcode("αβ", "ab.png") # Byte with special unicode
exportqrcode("瀚文", "瀚文.png") # Kanji mode

The following QR-Code is invalid(message µµ using Byte mode). mm

Rectified version: mumu

Error caused by the mixed uses of Byte mode and UTF8 mode.

## line 296 in QRCode.jl
## for Byte mode, `lastindex` should be replaced by `length`
ccindicator = getcharactercountindicator(lastindex(message), version, mode)
## line 156
## replace `Array{UInt8}(message)` by `UInt8.(collect(message))`
function encodedata(message::AbstractString, ::Byte)::BitArray{1}
    bytes = Array{UInt8}(message)
    bin = map(int2bitarray, Array{Int,1}(bytes))
    return vcat(bin...)
end

current:

RexWzh commented 2 years ago

Abstract

  1. Add support of Kanji mode
  2. Separate UTF8 mode from Byte mode (the length method for the two are different)
  3. Minor changes on some of the functions

I summarize the followings that might need to be reviewed before merging this PR. Looking forward to your advice if convenient.

RexWzh commented 2 years ago

matrix.jl -- line149

Here is my question:

# common part
using BenchmarkTools
using QRCoders: empytmatrix

_maskrules = Function[
    (x, y) -> (x ⊻ y) & 1,
    (x, _) -> x & 1,
    (_, y) -> y % 3,
    (x, y) -> (x + y) % 3,
    (x, y) -> (x >> 1 + y ÷ 3) & 1,
    (x, y) -> (x & y & 1) + (x * y % 3),
    (x, y) -> (x & y & 1 + x * y % 3) & 1,
    (x, y) -> ((x ⊻ y & 1) + (x * y % 3)) & 1
]
makemask(matrix::AbstractArray, k::Int)::BitArray{2} = makemask(matrix, _maskrules[k])
matrix = empytmatrix(20)
# original version
function makemask(matrix::AbstractArray, rule::Function)::BitArray{2}
    n = size(matrix, 1)
    mask = falses(size(matrix))
    @inbounds for row in 1:n, col in 1:n
        mask[row, col] = isnothing(matrix[row, col]) && iszero(rule(row - 1, col - 1))
    end
    return mask
end

@btime makemask.(Ref(matrix), 1:8); # 215.263 μs (20 allocations: 10.31 KiB)
# new version
function makemask(matrix::AbstractArray, rule::Function)::BitArray{2}
    n = size(matrix, 1)
    mask = falses(size(matrix))
    @inbounds for row in 1:n, col in 1:n
        if isnothing(matrix[row, col]) && iszero(rule(row - 1, col - 1))
            mask[row, col] = true
        end
    end
    return mask
end

@btime makemask.(Ref(matrix), 1:8); # 131.314 μs (20 allocations: 10.31 KiB)
RexWzh commented 2 years ago

table.jl -- line 11

I use the read function to initialize kanji.

"""
Shift-JIS characters download from: https://www.romhacking.net/documents/179/.
"""
_kanji = split.(split(strip(read("src/Kanji.tbl", String)), '\n'), '=')

"""
Allowed characters for `Kanji()` mode.
"""
const kanji = Dict{AbstractChar, Int}(
  first(j) => parse(Int, i; base=16) for (i, j) in _kanji)

This only works in QRCoders.jl. When I try using QRCoders in QRDecoders.jl, it would throw file path error. So I use include method instead.

My questions are, is it reasonable to initialize variables with text file especially when the file is large, and how to avoid path error when dealing with files.