girzel / ebdb

An EIEIO port of BBDB, Emacs' contact-management package
67 stars 12 forks source link

11,000,000+ line error file generated when importing my BBDB file #78

Closed hartzell closed 5 years ago

hartzell commented 5 years ago

I'm using the head of your github master branch, in emacs 26.1, from homebrew, on a mac.

I tried to use ebdb-migrate-from-bbdb migrate my bbdb.el and eventually ended up with an EBDB Migration Errors buffer with 11,697,109 lines. That emacs session is fairly unresponsive...

After a bit of poking around, I found that if I searched for lines that started with '^[' I could get the "main" lines, of which there were 14. Here's an anonymized example:

1113987:["Some" "Person" nil nil ("A Company") nil nil ("an-email@example.com" "another_email@example2.com") ((notes . "A notable note goes here....")) "44673efc-b514-4635-992a-dd3837f5b108" "2013-02-07" "2013-02-20" nil], error: (cl-no-applicable-method ebdb-record-add-org-role #s(#s(eieio--class ebdb-record-person "A record class representing a person." (#s(eieio--class ebdb-record-entity "An abstract class representing basic entities

A bit more poking convinced me that they all shared the error

error: (cl-no-applicable-method ebdb-record-add-org-role [ETCETCETC]

I can't see anything that the 14 error-prone entries have in common.

Here's the relevant a bit of my *Messages* buffer, starting with where my package manager (Straight) clones the repo.

Cloning ebdb...done
Building ebdb...done
If you want to keep ebdb, put (straight-use-package 'ebdb) in your init-file.
Upgrade from previous version of BBDB? (y or n) y
Loading EBDB sources...
~/.emacs.d/ebdb does not exist, create? (y or n) y
Upgrade from previous version of BBDB? (y or n) y
Migrating records from BBDB... 4760 records migrated
Initializing EBDB records... done
Migrating records from BBDB... 4760 records migrated
Mark set [2 times]
next-line: End of buffer
Loading /Users/hartzell/.emacs.d/var/cache/recentf...done
Mark set
Mark saved where search started
Mark set
Line 11697107
Mark set
Searched 1 buffer; 14 matches for "^\["
Mark saved where search started

Anything obvious that I can try? Any additional info I can pass you?

Thanks!

girzel commented 5 years ago

Good lord, that's horrible. Sorry for exploding your Emacs. Probably what happened is that I changed the calling convention for ebdb-record-add-org-role but didn't update the migration process: I've made that mistake before. It's never produced that much garbage before, though! Give me a day or so...

hartzell commented 5 years ago

Nothing to worry about, it made me laugh. Good thing that Emacs Is Not Slow(tm)....

I figure that there's some cross-product-like thing happening in there. I have about 4500 BBDB entries, so there's plenty of room for boom. The fact that only 14 seem to have exploded is hopeful, though I thought I'd be able to find something they obviously had in common.

girzel commented 5 years ago

Hmm, the calling convention is the same, so there must be something wrong with:

(seq-find
     (lambda (r)
       (string= o (ebdb-record-name r)))
    c-records)

Assuming that the no-applicable-method error comes from this form, which I would think it has to. Back to work...

girzel commented 5 years ago

The only way that error could be signalled is if the second argument to ebdb-record-add-org-role wasn't an organization, so the only thing I can think of is that you've got "person" records with the same name as organization records (or they've both got blank strings as names or something like that).

I've added a couple of checks to the function below. I hesitate to ask you to blow up your Emacs again, but would you eval this new definition (you might have to (require 'subr-x) first) and try again?

If it still doesn't work, I'll have to come up with a better way of cutting down the error message so I can actually see what it says.

(defun ebdb-migrate-from-bbdb (&optional file)
  (interactive)
  (require 'url-handlers)
  (require 'ebdb-org)
  (require 'ebdb-gnus)
  (with-current-buffer (find-file-noselect
            (or file
                (and (bound-and-true-p bbdb-file)
                 bbdb-file)
                (locate-user-emacs-file "bbdb" ".bbdb")))
    (when (and (/= (point-min) (point-max))
           (yes-or-no-p "Upgrade from previous version of BBDB? "))
      (let ((v-records (ebdb-migrate-parse-records))
        (target-db (ebdb-prompt-for-db nil t))
        (total 0)
        c-records duds)
    (message "Migrating records from BBDB...")
    (dolist (r v-records)
      (condition-case-unless-debug err
          (let ((orgs (ebdb-vrecord-organization r))
            (c-rec (ebdb-migrate-vector-to-class r))
            org)
        ;; Gives it a uuid if it doesn't already have one.
        (ebdb-db-add-record target-db c-rec)
        (when orgs
          (dolist (o orgs)
            ;; Make all the orgs into real records.
            (unless (string-blank-p o) ; There are many of these.
              (setq org (or (seq-find
                     (lambda (r)
                       (and (ebdb-record-organization-p t)
                        (string= o (ebdb-record-name r))))
                     c-records)
                    (let ((time (current-time)))
                      (ebdb-db-add-record
                       target-db
                       (make-instance
                    'ebdb-record-organization
                    :name (make-instance 'ebdb-field-name-simple :name o)
                    :timestamp
                    (make-instance 'ebdb-field-timestamp :timestamp time)
                    :creation-date
                    (make-instance 'ebdb-field-creation-date :timestamp time))))))
              ;; Create the connection between the org and the
              ;; person.
              (ebdb-record-add-org-role c-rec org)
              (unless (member org c-records)
            (push org c-records)))))
        (push c-rec c-records)
        (message "Migrating records from BBDB... %d" (cl-incf total)))
        (error
         (push (list r err) duds))))
    (when duds
      (pop-to-buffer
       (get-buffer-create "*EBDB Migration Errors*")
       '(nil . ((window-height . 10))))
      (insert "The records below could not be migrated:\n\n")
      (insert
       (mapconcat
        (lambda (r)
          (format "%S, error: %S" (car r) (cadr r)))
        duds "\n\n"))
      (fit-window-to-buffer)
      (goto-char (point-min)))
    (dolist (r c-records)
      (ebdb-init-record r))
    (setf (slot-value target-db 'dirty) t)
    (message "Migrating records from BBDB... %d records migrated"
         (length c-records))))))
hartzell commented 5 years ago

I LOVE blowing up my emacs! I’ll take it for a spin and let you know.

Are organization records something I have, or something ebdb builds up from e.g. company names?

g.

Sent from a device that makes me type with two fingers.

On Jun 27, 2019, at 6:04 PM, Eric Abrahamsen notifications@github.com wrote:

The only way that error could be signalled is if the second argument to ebdb-record-add-org-role wasn't an organization, so the only thing I can think of is that you've got "person" records with the same name as organization records (or they've both got blank strings as names or something like that).

I've added a couple of checks to the function below. I hesitate to ask you to blow up your Emacs again, but would you eval this new definition (you might have to (require 'subr-x) first) and try again?

If it still doesn't work, I'll have to come up with a better way of cutting down the error message so I can actually see what it says.

(defun ebdb-migrate-from-bbdb (&optional file) (interactive) (require 'url-handlers) (require 'ebdb-org) (require 'ebdb-gnus) (with-current-buffer (find-file-noselect (or file (and (bound-and-true-p bbdb-file) bbdb-file) (locate-user-emacs-file "bbdb" ".bbdb"))) (when (and (/= (point-min) (point-max)) (yes-or-no-p "Upgrade from previous version of BBDB? ")) (let ((v-records (ebdb-migrate-parse-records)) (target-db (ebdb-prompt-for-db nil t)) (total 0) c-records duds) (message "Migrating records from BBDB...") (dolist (r v-records) (condition-case-unless-debug err (let ((orgs (ebdb-vrecord-organization r)) (c-rec (ebdb-migrate-vector-to-class r)) org) ;; Gives it a uuid if it doesn't already have one. (ebdb-db-add-record target-db c-rec) (when orgs (dolist (o orgs) ;; Make all the orgs into real records. (unless (string-blank-p o) ; There are many of these. (setq org (or (seq-find (lambda (r) (and (ebdb-record-organization-p t) (string= o (ebdb-record-name r)))) c-records) (let ((time (current-time))) (ebdb-db-add-record target-db (make-instance 'ebdb-record-organization :name (make-instance 'ebdb-field-name-simple :name o) :timestamp (make-instance 'ebdb-field-timestamp :timestamp time) :creation-date (make-instance 'ebdb-field-creation-date :timestamp time)))))) ;; Create the connection between the org and the ;; person. (ebdb-record-add-org-role c-rec org) (unless (member org c-records) (push org c-records))))) (push c-rec c-records) (message "Migrating records from BBDB... %d" (cl-incf total))) (error (push (list r err) duds)))) (when duds (pop-to-buffer (get-buffer-create "EBDB Migration Errors") '(nil . ((window-height . 10)))) (insert "The records below could not be migrated:\n\n") (insert (mapconcat (lambda (r) (format "%S, error: %S" (car r) (cadr r))) duds "\n\n")) (fit-window-to-buffer) (goto-char (point-min))) (dolist (r c-records) (ebdb-init-record r)) (setf (slot-value target-db 'dirty) t) (message "Migrating records from BBDB... %d records migrated" (length c-records)))))) — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

girzel commented 5 years ago

Are organization records something I have, or something ebdb builds up from e.g. company names?

Yup, EBDB organizations are records, same as people, and they're constructed at migration time from BBDB's organization strings.

hartzell commented 5 years ago

Eric Abrahamsen writes:

Are organization records something I have, or something ebdb builds up from e.g. company names?

Yup, EBDB organizations are records, same as people, and they're constructed at migration time from BBDB's organization strings.

Which bit of an BBDB record is the org string?

Here's one of the BBDB entries that's generating the error (with the names, email addrs and numbers obfuscated):

["Firstname" "Lastname" nil ("Studio" "Something Photography" "SomethingElse" "Another @ Something Else Photography") ("Yet Another Photography") nil nil ("mailbox@example.com" "mailbox@mac.com") ((notes . "keep me")) "6ef9e33a-a739-4163-b096-d0369236b217" "2002-01-29" "2009-03-12" nil]

Which bit is the organization string?

g.

hartzell commented 5 years ago

Eric Abrahamsen writes:

The only way that error could be signalled is if the second argument to ebdb-record-add-org-role wasn't an organization, so the only thing I can think of is that you've got "person" records with the same name as organization records (or they've both got blank strings as names or something like that).

I've added a couple of checks to the function below. I hesitate to ask you to blow up your Emacs again, but would you eval this new definition (you might have to (require 'subr-x) first) and try again?

If it still doesn't work, I'll have to come up with a better way of cutting down the error message so I can actually see what it says.

That seems to have worked.

The first time I tried it it threw an error:

Symbol’s function definition is void: ebdb-migrate-parse-records

Which prompted me to load ebdb-migrate. I called it again and ran into the same error, but in retrospect, when I called it again I ended up using the old function from ebdb-migrate.

I ended up prepending 'new-' to your new function, so that I could be sure what I was getting, and it ran w/out complaint.

I now notice that many queries return two identical entries in the EBDB buffer. I first noticed it when checking on the entries that were "causing" the problem, but it seems to be many/most records/queries.

Is that "normal"?

Thanks for the help!

g.

girzel commented 5 years ago

Here's one of the BBDB entries that's generating the error (with the names, email addrs and numbers obfuscated): ["Firstname" "Lastname" nil ("Studio" "Something Photography" "SomethingElse" "Another @ Something Else Photography") ("Yet Another Photography") nil nil ("mailbox@example.com" "mailbox@mac.com") ((notes . "keep me")) "6ef9e33a-a739-4163-b096-d0369236b217" "2002-01-29" "2009-03-12" nil]

Which bit is the organization string? g.

It's ("Yet Another Photography) (a list because BBDB records can have multiple organizations. The fields in order are:

firstname lastname affix aka organization phone address mail xfields uuid creation-date timestamp cache

Glad the migration seems to be working now! I'll push these changes shortly.

I don't think the migration process should have created duplicate records, I don't see where that would happened. Duplicates were a common problem with BBDB before it gained uuids, is it possible there were duplicates in your BBDB originally?

hartzell commented 5 years ago

Well,

That leaves me with this record:

(alice)[12:32:13]~>>rg "Santa Cruz Bicycles" ~/.emacs.d/var/bbdb/bbdb.el
1293:["First" "Last" nil nil ("Santa Cruz Bicycles") (["work" 831 123 4567 111] ["cell" 123 456 7890 0]) (["work" ("104 Bronson St") "Santa Cruz" "CA" "95062" "US"]) ("someone@santacruzbicycles.com") ((notes . "blah blah")) "186031e7-fee4-4b96-bdeb-f13187a7e366" "2011-10-16" "2011-10-16" nil]

That's the only occurrence of the string "Santa Cruz Bicycles" in the old bbdb.el.

g.

girzel commented 5 years ago

Wait these are the cases that caused the original error, or these are cases where you're seeing duplicates?

hartzell commented 5 years ago

Eric Abrahamsen writes:

Wait these are the cases that caused the original error, or these are cases where you're seeing duplicates?

These are the seven cases that generated 14 very very long lines of errors (each case was duplicated). The entries are not duplicated in the .bbdb.el file.

I thought that when I used your fixed function, I didn't get any errors, but.... I just deleted .emacs.d/ebdb (thinking that I might have duplicates in it because of my repeated attempts at importing) and I can't recreate the problem-free run.

With respect to the duplication, here's a reply I started to your previous comment:


Eric Abrahamsen writes:

[...] I don't think the migration process should have created duplicate records, I don't see where that would happened. Duplicates were a common problem with BBDB before it gained uuids, is it possible there were duplicates in your BBDB originally?

This record in bbdb.el

(alice)[12:37:13]~>>rg hartzell@alerce.com .emacs.d/var/bbdb/bbdb.el
1432:["George" "Hartzell" nil nil nil nil nil ("hartzell@alerce.com") nil "07776c99-0b9a-4b51-89f0-d6772da4922f" "2018-08-08" "2018-08-08" nil]
(alice)[12:37:21]~>>

seems to have created two ebdb records. When I use M-x ebdb to search for hartzell@alerce.com, I get a buffer containing this:

 George Hartzell
 mail: hartzell@alerce.com

 George Hartzell
 mail: hartzell@alerce.com

I do seem to have two records in my ebdb database:

(alice)[12:37:21]~>>rg hartzell@alerce.com .emacs.d/ebdb
76363:          :mail "hartzell@alerce.com"
189136:          :mail "hartzell@alerce.com"
(alice)[12:39:31]~>>

I have this record beginning at line 76350

    (ebdb-record-person "ebdb-record-person"
      :uuid
      (ebdb-field-uuid "ebdb-field-uuid"
        :uuid "07776c99-0b9a-4b51-89f0-d6772da4922f")
      :creation-date
      (ebdb-field-creation-date "ebdb-field-creation-date"
        :timestamp '(23402 12928))
      :timestamp
      (ebdb-field-timestamp "ebdb-field-timestamp"
        :timestamp '(23402 12928))
      :mail
      (list
        (ebdb-field-mail "ebdb-field-mail"
          :mail "hartzell@alerce.com"
          :priority primary))
      :name
      (ebdb-field-name-complex "ebdb-field-name-complex"
        :surname "Hartzell"
        :given-names '("George")))

and this record at line 189123

    (ebdb-record-person "ebdb-record-person"
      :uuid
      (ebdb-field-uuid "ebdb-field-uuid"
        :uuid "07776c99-0b9a-4b51-89f0-d6772da4922f")
      :creation-date
      (ebdb-field-creation-date "ebdb-field-creation-date"
        :timestamp '(23402 12928))
      :timestamp
      (ebdb-field-timestamp "ebdb-field-timestamp"
        :timestamp '(23402 12928))
      :mail
      (list
        (ebdb-field-mail "ebdb-field-mail"
          :mail "hartzell@alerce.com"
          :priority primary))
      :name
      (ebdb-field-name-complex "ebdb-field-name-complex"
        :surname "Hartzell"
        :given-names '("George")))

I just made several tries in which I

  1. used M-x straight-use-package to load ebdb
  2. opened my bugfix.el which has my renamed version of your modified migration (defun new-ebdb-migrate-from-bbdb (&optional file) ... and eval'ed the buffer
  3. M-x load-library ebdb-migrate
  4. M-x new-ebdb-migrate-from-bbdb

and the errors still happened and I have duplication as discussed above.

g.

girzel commented 5 years ago

Well dammit. Give me a day or so to figure it out. Thanks again for reporting this...

hartzell commented 5 years ago

My pleasure, and thank you for helping me work it out.

One thing that might be simpler would be if we worked with an updated/debugging/fixed branch of the project in the GitHub repo. I can easily configure straight.el to pull from the branch of my choice. That way I won't have any fears of how I slot your debugging versions into what I'm actually running. I worry that I've somehow screwed up what you've given me to run and that's the source of (or adding confusion to) the issue.

Thanks (again)!

g.

girzel commented 5 years ago

Well it's pretty obvious that there's a bug here beyond just patchwork debugging! But you're right it might be easier to work with a fix branch. I'll push a bug/78 branch in a bit, with my previous commit which I think is still desirable, though it clearly didn't do the whole trick. I'll continue to put further commits there.

girzel commented 5 years ago

Ha, that says "fixes #78", that was optimistic...

hartzell commented 5 years ago

I've configured things to use this branch:

  (use-package ebdb
    :straight (ebdb :type git :host github :repo "girzel/ebdb"
                    :branch "bug/78"
                    )
    )

and confirmed that I am using that branch (run in ~/.emacs.d/straight/repos/ebdb):

(alice)[15:05:52]ebdb>>git status
On branch bug/78
Your branch is up to date with 'origin/bug/78'.

nothing to commit, working tree clean

When I M-x ebdb-migrate-from-bbdb it asks if I want to upgrade the bbdb data (I respond 'y'), then it asks if it should create the ebdb file (I respond yes), then it asks (again) if it should upgrade the bbdb data (I respond 'y'). Then is successfully imports the file. *Messages* contains

Upgrade from previous version of BBDB? (y or n) y
Loading EBDB sources...
~/.emacs.d/ebdb does not exist, create? (y or n) y
Upgrade from previous version of BBDB? (y or n) y
Migrating records from BBDB... 4773 records migrated
Initializing EBDB records... done
Migrating records from BBDB... 4773 records migrated

If I search (M-x ebdb) for hartzell@alerce.com, I get the single record that I expect. If I search for a different address string (my father's...), I get two records.

At this point I notice that ~/.emacs.d/ebdb has not be updated on disk. Quitting emacs causes it to be written to.

Now if I ripgrep my email addr from the file, I see:

(alice)[14:51:38]~>>rg hartzell@alerce.com .emacs.d/ebdb
76458:          :mail "hartzell@alerce.com"
189344:          :mail "hartzell@alerce.com"
(alice)[14:52:35]~>>

Now if I start emacs and M-x ebdb and then search for hartzell@alerce.com, it messages Loading EBDB sources and goes away for a minute or two, then returns the single entry.

Searching for my father's address now returns a single record. There are also two entries for his email addr in the db:

(alice)[14:55:50]~>>rg noone@ptd.net .emacs.d/ebdb
76574:          :mail "noone@ptd.net"
189460:          :mail "noone@ptd.net"

FWIW, I've repeated this, twice. Quit emacs, removed ~/.emacs.d/ebdb, and run through the above.

It's tempting to think that it's something simple (e.g. it's asking me about migrating twice, then it migrates twice, then I have two copies), but I don't know how to explain the observation that, even though there are two entries for hartzell@alerce.com in ~/.emacs.d/ebdb, there's only one in the *EBDB* buffer when I query it.

girzel commented 5 years ago

Oooh, I think I've got it! The problem (it isn't your problem) is that you're calling ebdb-migrate-from-ebdb directly, rather than calling one of ebdb or ebdb-open, which both automatically detect that you have no EBDB records, and look for a BBDB file, and then offer to migrate. If you call the migration function directly it hits ebdb-prompt-for-db, which doesn't know we're migrating and tries to call ebdb-load when necessary, which essentially leads to a second nested load/migrate routine within the first, which leads to the double loading that you're seeing.

Anyway! Ignore all that, this shouldn't be too hard to fix.

hartzell commented 5 years ago

Seems like that's right. I just removed the ~/.emacs.d/ebdb file, ran M-x ebdb, told it to search for hartzell@alerce.com, it prompted to create a new ebdb, then asked to migrate my BBDB data. So far, there's only been one of anything that I've searched for.

girzel commented 5 years ago

I think what I'll do is just remove the excessively-clever auto-migrate routine from the regular load process. Keep the basic functions simple. Users should call the migration commands explicitly when they want them.

hartzell commented 5 years ago

I'm a big fan of simple. Sounds good to me. Thanks for all the work and the great tool.