JuliaText / WordNet.jl

A Julia package for Princeton's WordNet®.

Return special value when words not found? #9

Closed chengchingwen closed 6 years ago

chengchingwen commented 6 years ago

Currently, WordNet.jl throws a KeyError when searching for a word that is not in the database. For example:

julia> db['v', "flew"]
ERROR: KeyError: key "flew" not found
Stacktrace:
 [1] getindex at ./dict.jl:474 [inlined]
 [2] getindex(::WordNet.DB, ::Char, ::String) at /home/peter/.julia/v0.6/WordNet/src/db.jl:20

Would it be better to return something like an empty Lemma in that situation?

oxinabox commented 6 years ago

I don't think so, no. I think an error is what should happen. Can you motivate why the user might not want to be informed by an error? Any further processing of a fake lemma could lead to incorrect results, or to later, more confusing errors. E.g., does the empty lemma have a synset? Does that synset have a gloss?

BTW, in this case you wanted to use db['v', "fly"], which is the stem.

cross-ref #10

chengchingwen commented 6 years ago

Can you motivate why the user might not want to be informed by an error? ...

That is indeed a problem, too.

My original thought was that I need some way to handle out-of-vocabulary words without having to catch a KeyError. Catching such a general error inside a special-purpose function is the last thing I would like to do.

I guess a better way to deal with both problems would be to provide some methods that also accept a default value?

Or maybe, like Python's NLTK WordNet interface, provide a synsets function so that the user never has to touch the Lemma directly.

like

synsets(db::WordNet.DB, word::String)
synsets(db::WordNet.DB, word::String, pos::Char)
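
A minimal sketch of how such a function could behave, assuming unknown words should simply yield an empty result. `MockDB`, `Lemma`, `Synset` and the gloss text here are hypothetical stand-ins made up for illustration; they are not the package's actual types or data, and the argument order follows the signatures proposed above.

```julia
# Hypothetical stand-ins; the real WordNet.jl structs may look different.
struct Synset
    gloss::String
end

struct Lemma
    word::String
    synsets::Vector{Synset}
end

struct MockDB
    lemmas::Dict{Char, Dict{String, Lemma}}
end

# Look up `word` under one part of speech; unknown keys give an empty vector.
function synsets(db::MockDB, word::AbstractString, pos::Char)
    by_word = get(db.lemmas, pos, nothing)
    by_word === nothing && return Synset[]
    lemma = get(by_word, word, nothing)
    return lemma === nothing ? Synset[] : lemma.synsets
end

# Look up `word` under every part of speech present in the database.
function synsets(db::MockDB, word::AbstractString)
    found = Synset[]
    for pos in keys(db.lemmas)
        append!(found, synsets(db, word, pos))
    end
    return found
end

# Made-up example data.
fly = Lemma("fly", [Synset("travel through the air")])
db = MockDB(Dict('v' => Dict("fly" => fly)))

synsets(db, "fly", 'v')    # one-element vector
synsets(db, "flew", 'v')   # empty vector instead of a KeyError
```

With this shape, the caller can test `isempty(...)` instead of catching an exception, and no fake Lemma is ever created.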

oxinabox commented 6 years ago

I see nothing wrong with:

try
    db['v', "flew"]
catch err
    err isa KeyError || rethrow(err)
    # Handle it however you want
end

A KeyError is exactly what this is, and it's not like that error could be coming from somewhere else inside your try block.

I guess a better way to deal with both problems would be to provide some methods that also accept a default value?

I agree. I think DB should implement haskey(db, key...) and get(::DB, key..., default), delegating both to the matching functions for db.lemmas (the internal dict).

chengchingwen commented 6 years ago

I think DB should implement haskey(db, key...) and get(::DB, key..., default), delegating both to the matching functions for db.lemmas (the internal dict).

Does that mean we should implement DB as a subtype of Associative instead of a composite type of Dicts?

oxinabox commented 6 years ago

Does that mean we should implement DB as a subtype of Associative instead of a composite type of Dicts?

Those things are not mutually exclusive. If it were to be a subtype of Associative, it would still be a composite type of Dicts. It would just also implement the rest of the Associative interface.
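
To illustrate, here is a rough sketch of a type that is still a composite of Dicts but also subtypes the dictionary abstract type. It is written against current Julia, where Associative has been renamed AbstractDict. `DictDB` and its String values are hypothetical stand-ins, not the package's DB, and the set of methods shown is only an assumption about what is needed.

```julia
# Hypothetical stand-in: keys are (pos, word) tuples, values are Strings
# (in place of Lemma objects) to keep the sketch short.
struct DictDB <: AbstractDict{Tuple{Char, String}, String}
    lemmas::Dict{Char, Dict{String, String}}   # still a composite of Dicts
end

# Total number of (pos, word) entries.
Base.length(db::DictDB) = mapreduce(length, +, values(db.lemmas); init = 0)

# Flatten the nested Dicts into (pos, word) => value pairs.
entries(db::DictDB) =
    ((pos, word) => value for (pos, words) in db.lemmas for (word, value) in words)
Base.iterate(db::DictDB, state...) = iterate(entries(db), state...)

# Lookup with a default, delegating to the inner Dicts.
function Base.get(db::DictDB, key::Tuple{Char, <:AbstractString}, default)
    pos, word = key
    words = get(db.lemmas, pos, nothing)
    return words === nothing ? default : get(words, word, default)
end

db = DictDB(Dict('v' => Dict("fly" => "lemma for fly", "run" => "lemma for run")))

get(db, ('v', "flew"), "missing")   # falls back to the default
haskey(db, ('v', "fly"))            # generic AbstractDict fallback
db['v', "fly"]                      # multi-argument indexing becomes a tuple key
```

The last two calls rely on Base's generic AbstractDict fallbacks rather than on methods defined here, which is the main thing subtyping would buy.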

Maybe. I'm trying to think of the pros and cons. One con is that the informal Associative interface is not yet documented, so it is nontrivial to work out whether it is being met correctly.

chengchingwen commented 6 years ago

I see. Maybe for now, just implementing haskey and get would be the better choice.

chengchingwen commented 6 years ago

What about

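# Extend the Base functions so that generic code calling haskey/get sees these methods.
Base.haskey(db::DB, pos::Char) = haskey(db.lemmas, pos)
# The inner lookup must check `word`, not `pos`.
Base.haskey(db::DB, pos::Char, word::AbstractString) = haskey(db, pos) && haskey(db.lemmas[pos], word)

Base.get(db::DB, pos::Char, word::AbstractString, default) = haskey(db, pos, word) ? db.lemmas[pos][word] : default
# Convenience method accepting the arguments in the opposite order.
Base.get(db::DB, word::AbstractString, pos::Char, default) = get(db, pos, word, default)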
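
For illustration, a small self-contained sketch of how methods along these lines could be used to handle out-of-vocabulary words without try/catch. `ToyDB` is a hypothetical stand-in that models only the nested `lemmas` Dict, with Strings in place of Lemma objects. Note that the inner `haskey` call checks `word`.

```julia
# Hypothetical stand-in for the package's DB type.
struct ToyDB
    lemmas::Dict{Char, Dict{String, String}}
end

Base.haskey(db::ToyDB, pos::Char) = haskey(db.lemmas, pos)
Base.haskey(db::ToyDB, pos::Char, word::AbstractString) =
    haskey(db, pos) && haskey(db.lemmas[pos], word)

Base.get(db::ToyDB, pos::Char, word::AbstractString, default) =
    haskey(db, pos, word) ? db.lemmas[pos][word] : default

db = ToyDB(Dict('v' => Dict("fly" => "lemma for fly")))

# Out-of-vocabulary words are handled through the default value.
for word in ["fly", "flew"]
    lemma = get(db, 'v', word, nothing)
    if lemma === nothing
        println(word, ": not in the database")
    else
        println(word, ": ", lemma)
    end
end
```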

oxinabox commented 6 years ago

Looks sensible enough to me. Make a PR?