emacs-helm / helm-dictionary

Helm source for looking up dictionaries
31 stars 12 forks source link

unefficient code in helm-dictionary-transformer and error handling #13

Closed thierryvolpiatto closed 10 years ago

thierryvolpiatto commented 10 years ago

You are computing all candidates with remove-if, then recomputing them a second time with the loop, inefficient.

(defun helm-dictionary-transformer (candidates)
  "Formats entries retrieved from the data base."
  (let ((cands (remove-if (lambda (i) (string-match "\\`#" i)) candidates)))
    (loop for i in cands

Also please check as I done for existence of entry, (car entry), (cdr entry), i.e if one line doesn't have a " :: " separator, the loop fail.

tmalsburg commented 10 years ago

You are computing all candidates with remove-if, then recomputing them a second time with the loop, inefficient.

I did it for the sake of readability, but wasn't happy with it either. I will fix it.

Also please check as I done for existence of entry, (car entry), (cdr entry), i.e if one line doesn't have a " :: " separator, the loop fail.

In a well-formed dictionary the " :: " should never be missing (and I never encountered a case where it was missing). If it is missing, that means that something's wrong with the database and I prefer to get some kind of error message in that case.

thierryvolpiatto commented 10 years ago

Titus von der Malsburg notifications@github.com writes:

In a well-formed dictionary the " :: " should never be missing (and I never encountered a case where it was missing). If it is missing, that means that something's wrong with the database and I prefer to get some kind of error message in that case.

When it is missing it is in the headers, and yes it is happening in the two dictionaries I have downloaded (en-fr and fr-en). So it doesn't harm checking this for safety.

Thierry Get my Gnupg key: gpg --keyserver pgp.mit.edu --recv-keys 59F29997

tmalsburg commented 10 years ago

and yes it is happening in the two dictionaries I have downloaded (en-fr and fr-en)

Ok, in that case we have to do something about it. I don't feel comfortable with simply ignoring malformed entries but bothering the user with errors may be even worse because it's unlike that the user will fix them.

tmalsburg commented 10 years ago

Fixed both issues in 705b76d4d7eadf20c4d23f2b9d4df5346d22fca0. Thanks, Thierry, for the contribution.

thierryvolpiatto commented 10 years ago

Titus von der Malsburg notifications@github.com writes:

and yes it is happening in the two dictionaries I have downloaded (en-fr and fr-en)

Ok, in that case we have to do something about it. I don't feel comfortable with simply ignoring malformed entries

They are not malformed entries and if they are they are in headers (missing " :: ") Otherwise in real entries (not headers) you may have in some places lines like this:

a {art} /eɪ/ (an) SEE: an ::

In this case in your code:

(split-string "a {art} /eɪ/ (an) SEE: an ::" " :: ") OK=> ("a {art} /eɪ/ (an) SEE: an ::") (split-string (cadr '("a {art} /eɪ/ (an) SEE: an ::")) " | ") ERROR=> split-string: Wrong type argument: stringp, nil

So you should check existence of cdr.

    for headerp = (string-match "\\`#" i)
    ;; no need to compute entry if it is a header line
    for entry = (and (not headerp) (split-string i " :: "))
    ;; no need to compute l1terms if entry have not been computed.
    for l1terms = (and entry (split-string (car entry) " | "))
    ;; If entry have been computed check its cdr
    for l2terms = (helm-aif (cdr entry) (split-string (car it) " | "))

Thierry Get my Gnupg key: gpg --keyserver pgp.mit.edu --recv-keys 59F29997

tmalsburg commented 10 years ago

Of course I fully agree about the headers.

Could you please provide me with a link to your dictionary for testing? My dictionaries don't have these cases. I would like to check if there are also other irregular cases.

tmalsburg commented 10 years ago

Added a fix for the cases where the entry ends with " ::". See 8b63d4738e6f91d1d72b965982bdad065fb55337.

My aim was to keep sanity checking and the actual processing of entries separate to improve readability.

tmalsburg commented 10 years ago

Should be fixed now. See last comment. I will close this for now.