fisharebest / webtrees

Online genealogy
https://webtrees.net
GNU General Public License v3.0

Duplicated media after import GEDCOM #1714

Open UksusoFF opened 6 years ago

UksusoFF commented 6 years ago

Version: 1.7.9

Steps to reproduce:

  1. Create GEDCOM file with attached images or use this: https://www36.zippyshare.com/v/LhQhUXc4/file.html
  2. Create new tree in webtrees
  3. Go to import page and import GEDCOM file
  4. Go to import page again and check "Keep media objects" and import GEDCOM file again
  5. Go to the media list and see only one file, but the individual's media tab contains two media links with the same ID

After a third or later import, the number of duplicates does not increase.

fisharebest commented 6 years ago

I think this is working correctly.

The "Keep media objects" option is designed for users who

1) edit the GEDCOM using a desktop application
2) use webtrees to display the data
3) add media objects in webtrees
4) use a desktop program that does not support (or deletes) media objects.

When you use this option, webtrees expects to import a GEDCOM file that does not contain media.

It will then try to merge the current media objects and links with the new GEDCOM file.

UksusoFF commented 6 years ago

> It will then try to merge the current media objects and links with the new GEDCOM file.

So the result must contain merged results without duplicates. Right?

If this option is not checked, the import deletes all existing media and creates new ones from the GEDCOM file. And we lose all data added in webtrees (such as notes). Is that also normal behavior?

fisharebest commented 6 years ago

> So the result must contain merged results without duplicates. Right?

We do not check for duplicates. The media have been deleted from the GEDCOM file.

If the media objects exist in both the GEDCOM file and webtrees, then you will get duplicates.
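
To illustrate why the duplicates appear, here is a simplified model. It is not the actual webtrees import code; the function name `mergeKeptMediaLinks` and the sample records are hypothetical. It appends the kept media links to the newly imported record without checking whether the record already contains them:

```php
<?php

/**
 * Hypothetical model of "Keep media objects": append the media links kept
 * from the old tree to the freshly imported record, with no duplicate check.
 *
 * @param string   $importedRecord GEDCOM record from the new file
 * @param string[] $keptLinks      link lines saved from the old tree
 *
 * @return string
 */
function mergeKeptMediaLinks(string $importedRecord, array $keptLinks): string
{
    foreach ($keptLinks as $link) {
        $importedRecord .= "\n" . $link;
    }

    return $importedRecord;
}

// The new GEDCOM file still contains the media link ...
$imported = "0 @I1@ INDI\n1 NAME John /Doe/\n1 OBJE @M70@";
// ... and the same link was also kept from the previous import.
$merged = mergeKeptMediaLinks($imported, ['1 OBJE @M70@']);

// Count how often the link occurs in the merged record (here: twice).
echo substr_count($merged, '1 OBJE @M70@'), "\n";
```

If the imported file contained no media, as the option expects, the same merge would leave exactly one link.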

> And we lose all data added in webtrees (such as notes). Is that also normal behavior?

Explain exactly what you did and what happened. Tell me how I can reproduce the problem.

UksusoFF commented 6 years ago

There is a notice on the import page:

> This will delete all the genealogy data from ‘testtetsetes’ and replace it with data from a GEDCOM file.

Perhaps this does not quite match expectations. I expected it to affect only genealogy data (such as individuals and their relations), but in fact it removes all data.

> Tell me how I can reproduce the problem.

  1. Create GEDCOM file with attached images or use this: https://www36.zippyshare.com/v/LhQhUXc4/file.html
  2. Create new tree in webtrees
  3. Go to import page and import GEDCOM file
  4. Add notes to attached image or other data on /addmedia.php?action=editmedia&pid=M70
  5. Go to import page again and import GEDCOM file again
  6. Notes added in step 4 are missing because the media object was deleted and recreated with a different ID

UksusoFF commented 6 years ago

As I can see here, createMediaObject returns the existing media ID if the media object matches by filename.

So convertInlineMedia could be changed to something like the code below, so that we do not lose media data and do not get duplicates with "Keep media objects".


    /**
     * Extract inline media data, and convert to media objects.
     *
     * @param Tree $tree
     * @param string $gedrec
     *
     * @return string
     */
    public static function convertInlineMedia(Tree $tree, $gedrec)
    {
        while (preg_match('/\n1 OBJE(?:\n[2-9].+)+/', $gedrec, $match)) {
            $inline = $match[0];
            $media = self::createMediaObject(1, $match[0], $tree);
            $gedrec = self::attachMediaObject($gedrec, $inline, $media);
        }
        while (preg_match('/\n2 OBJE(?:\n[3-9].+)+/', $gedrec, $match)) {
            $inline = $match[0];
            $media = self::createMediaObject(2, $match[0], $tree);
            $gedrec = self::attachMediaObject($gedrec, $inline, $media);
        }
        while (preg_match('/\n3 OBJE(?:\n[4-9].+)+/', $gedrec, $match)) {
            $inline = $match[0];
            $media = self::createMediaObject(3, $match[0], $tree);
            $gedrec = self::attachMediaObject($gedrec, $inline, $media);
        }

        return $gedrec;
    }

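    /**
     * Attach the media link only if the record does not already contain it;
     * otherwise just remove the inline media data.
     *
     * @param string $gedrec GEDCOM record being converted
     * @param string $inline inline media data matched in the record
     * @param string $media  media link returned by createMediaObject
     *
     * @return string
     */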
    public static function attachMediaObject($gedrec, $inline, $media)
    {
        if (strpos($gedrec, $media) === false) {
            return str_replace($inline, $media, $gedrec);
        } else {
            return str_replace($inline, '', $gedrec);
        }
    }
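
A minimal standalone sketch of how this helper would behave, so it can be run outside webtrees. It assumes createMediaObject returns a link line such as "\n1 OBJE @M70@"; that return value, the sample record and the file name photo.jpg are assumptions for illustration only:

```php
<?php

// Standalone copy of the helper above, so it can be run outside webtrees.
function attachMediaObject(string $gedrec, string $inline, string $media): string
{
    if (strpos($gedrec, $media) === false) {
        return str_replace($inline, $media, $gedrec);
    }

    return str_replace($inline, '', $gedrec);
}

// Assumed return value of createMediaObject for the matched file.
$link   = "\n1 OBJE @M70@";
// Inline media data as it would be matched by the level-1 regex.
$inline = "\n1 OBJE\n2 FILE photo.jpg";

// Case 1: the record has no link yet, so the inline block becomes a link.
$fresh = "0 @I1@ INDI\n1 NAME John /Doe/" . $inline;
$a = attachMediaObject($fresh, $inline, $link);

// Case 2: the link is already present, so the inline block is dropped.
$b = attachMediaObject($fresh . $link, $inline, $link);

// Both calls yield the same record with exactly one link.
var_dump($a === $b);
```

In both cases the record ends up with a single "1 OBJE @M70@" line, which is the behavior wanted for "Keep media objects".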