robotdana / spellr

Spell check your source code
MIT License

Validation for dictionaries #79

Open voidless opened 2 years ago

voidless commented 2 years ago

Hi, my team has encountered unexpected, incorrect behaviour when custom dictionaries are misconfigured (wrong order, wrong case, incorrect newlines), so I've written some functions to find these errors.

The code can be wrapped in a new spellr command to validate dictionaries, or it can optionally be called when spell checking (for setups with small dictionaries). You could also spec test the bundled dictionaries like this, if there is currently no validation.


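# Returns the indexes of entries identical to the entry directly before them.
# Only adjacent duplicates are found, so this relies on the list being sorted.
def find_duplicates(array)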
    prev = nil
    indexes = []
    array.each_with_index do |curr, index|
        indexes.append(index) if prev == curr
        prev = curr
    end
    indexes
end

# Returns the indexes of entries that sort before the entry directly before them.
def find_not_ascending(array)
    prev = nil
    indexes = []
    array.each_with_index do |curr, index|
        indexes.append(index) if prev && curr < prev
        prev = curr
    end
    indexes
end

# Returns the indexes of entries that contain uppercase characters.
def find_not_lowercase(array)
    indexes = []
    array.each_with_index do |curr, index|
        indexes.append(index) unless curr == curr.downcase
    end
    indexes
end

error_type = "error"

errors = []
Dir.glob('.spellr_wordlists/*.txt').each do |file|
    next unless File.file? file
    contents = File.readlines(file)

    find_not_ascending(contents).each do |index|
        errors.append "#{file}:#{index+1}: #{error_type}: words must be ordered ascending"
    end

    find_duplicates(contents).each do |index|
        errors.append "#{file}:#{index+1}: #{error_type}: duplicate word"
    end

    find_not_lowercase(contents).each do |index|
        errors.append "#{file}:#{index+1}: #{error_type}: words must be lowercase"
    end

    if contents.count > 0
        if contents.first.length == 1
            errors.append "#{file}:1: #{error_type}: first line must not be empty"
        end

        unless contents.last.end_with? "\n"
            errors.append "#{file}:#{contents.count}: #{error_type}: must have newline at the end of file"
        end
    end
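end

# Report everything that was found and fail if there were any errors.
puts errors
exit 1 unless errors.empty?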
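
For the spec-testing idea mentioned above, the checks could be folded into one pure function that takes the lines of a wordlist and returns the problems it finds, which makes it easy to call from a test. This is only a rough sketch: `wordlist_errors` is a hypothetical name, not part of spellr, and it is slightly stricter than the script above in that it flags every empty line rather than only the first.

```ruby
# Hypothetical helper, not part of spellr.
# Takes the lines of one wordlist (as returned by File.readlines) and
# returns a sorted array of [line_number, message] pairs.
def wordlist_errors(lines)
  errors = []

  lines.each_with_index do |line, index|
    line_number = index + 1
    word = line.chomp
    errors << [line_number, 'line must not be empty'] if word.empty?
    errors << [line_number, 'words must be lowercase'] unless word == word.downcase
    # File.readlines keeps the trailing "\n", so only a final line
    # without a newline can fail this check.
    errors << [line_number, 'must end with a newline'] unless line.end_with?("\n")
  end

  # Compare each word with the one before it; line numbers start at 2
  # because the first pair ends on the second line.
  lines.map(&:chomp).each_cons(2).with_index(2) do |(prev, curr), line_number|
    if curr == prev
      errors << [line_number, 'duplicate word']
    elsif curr < prev
      errors << [line_number, 'words must be ordered ascending']
    end
  end

  errors.sort
end
```

A spec could then assert that `wordlist_errors(File.readlines(path))` is empty for each wordlist file it cares about, and a command-line wrapper could format the pairs as `file:line: error: message`.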