Closed thierryvolpiatto closed 10 years ago
You are computing all candidates with remove-if, then recomputing them a second time with the loop, inefficient.
I did it for the sake of readability, but wasn't happy with it either. I will fix it.
Also please check as I done for existence of entry, (car entry), (cdr entry), i.e if one line doesn't have a " :: " separator, the loop fail.
In a well-formed dictionary the " :: " should never be missing (and I never encountered a case where it was missing). If it is missing, that means that something's wrong with the database and I prefer to get some kind of error message in that case.
Titus von der Malsburg notifications@github.com writes:
In a well-formed dictionary the " :: " should never be missing (and I never encountered a case where it was missing). If it is missing, that means that something's wrong with the database and I prefer to get some kind of error message in that case.
When it is missing it is in the headers, and yes it is happening in the two dictionaries I have downloaded (en-fr and fr-en). So it doesn't harm checking this for safety.
Thierry Get my Gnupg key: gpg --keyserver pgp.mit.edu --recv-keys 59F29997
and yes it is happening in the two dictionaries I have downloaded (en-fr and fr-en)
Ok, in that case we have to do something about it. I don't feel comfortable with simply ignoring malformed entries but bothering the user with errors may be even worse because it's unlike that the user will fix them.
Fixed both issues in 705b76d4d7eadf20c4d23f2b9d4df5346d22fca0. Thanks, Thierry, for the contribution.
Titus von der Malsburg notifications@github.com writes:
and yes it is happening in the two dictionaries I have downloaded (en-fr and fr-en)
Ok, in that case we have to do something about it. I don't feel comfortable with simply ignoring malformed entries
They are not malformed entries and if they are they are in headers (missing " :: ") Otherwise in real entries (not headers) you may have in some places lines like this:
a {art} /eɪ/ (an) SEE: an ::
In this case in your code:
(split-string "a {art} /eɪ/ (an) SEE: an ::" " :: ") OK=> ("a {art} /eɪ/ (an) SEE: an ::") (split-string (cadr '("a {art} /eɪ/ (an) SEE: an ::")) " | ") ERROR=> split-string: Wrong type argument: stringp, nil
So you should check existence of cdr.
for headerp = (string-match "\\`#" i)
;; no need to compute entry if it is a header line
for entry = (and (not headerp) (split-string i " :: "))
;; no need to compute l1terms if entry have not been computed.
for l1terms = (and entry (split-string (car entry) " | "))
;; If entry have been computed check its cdr
for l2terms = (helm-aif (cdr entry) (split-string (car it) " | "))
Thierry Get my Gnupg key: gpg --keyserver pgp.mit.edu --recv-keys 59F29997
Of course I fully agree about the headers.
Could you please provide me with a link to your dictionary for testing? My dictionaries don't have these cases. I would like to check if there are also other irregular cases.
Added a fix for the cases where the entry ends with " ::". See 8b63d4738e6f91d1d72b965982bdad065fb55337.
My aim was to keep sanity checking and the actual processing of entries separate to improve readability.
Should be fixed now. See last comment. I will close this for now.
You are computing all candidates with remove-if, then recomputing them a second time with the loop, inefficient.
Also please check as I done for existence of entry, (car entry), (cdr entry), i.e if one line doesn't have a " :: " separator, the loop fail.