Closed ashmaroli closed 1 year ago
Hello @jrfern,
The Dotclear importer has been rewritten based on the export file you had provided.
I would now like to know the directory structure of the "media folder" (media.zip unpacked) so as to implement the functionality behind --mediafolder and maintain backwards-compatibility.
If you wish to try this out, you may edit your Gemfile as follows:
# Gemfile
gem "jekyll"
gem "jekyll-import", github: "jekyll/jekyll-import", ref: "refs/pull/512/head"
(There is no need to include activesupport or any of the previous dependency gems).
TODO:
- Implement --mediafolder
- Import into _drafts to avoid unintentionally overwriting existing namesakes in _posts

Great! Thank you very much. Now I get
invalid option: --mediafolder (OptionParser::InvalidOption)
When run without this option (and after following your instructions)
$ bundle exec jekyll import dotclear --datafile path_to_backup.txt
jekyll 4.3.2 | Error: Illegal quoting in line 1.
/usr/lib/ruby/3.1.0/csv/parser.rb:955:in `parse_quotable_robust': Illegal quoting in line 1. (CSV::MalformedCSVError)
I would now like to know the directory structure of the "media folder" (media.zip unpacked)
Inside the zip archive there's a "img" directory with the image files and subdirectories.
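For readers hitting the same CSV::MalformedCSVError: the thread does not state the actual cause, but one plausible trigger is visible in the backup excerpt quoted further down, where quotes inside a field are escaped with a backslash (\") rather than doubled (""). Ruby's CSV parser rejects that style by default. The sketch below uses a made-up record and a hypothetical helper, parse_backslash_escaped, and is not the importer's code.

```ruby
require "csv"

# Made-up record in the style of the backup excerpt quoted in this thread:
# inner quotes are escaped as \" instead of being doubled as "".
line = '"Some title","<p style=\"text-align: justify;\">Hello</p>"'

# Hypothetical helper (an assumption, not part of jekyll-import):
# convert backslash-escaped quotes to doubled quotes, then parse normally.
# Caveat: a literal backslash directly before a closing quote would be
# misread by this naive substitution.
def parse_backslash_escaped(line)
  CSV.parse_line(line.gsub('\\"', '""'))
end

begin
  CSV.parse_line(line)
rescue CSV::MalformedCSVError => e
  puts "Strict parsing fails: #{e.message}"
end

p parse_backslash_escaped(line)
```

Whether the rewritten importer takes this route or parses the file by other means is not stated here.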
@jrfern The CSV::MalformedCSVError is a bug that needs to be fixed. I would like to take a look at the actual backup file you used to test this branch. You may email the file to me directly instead of exposing it here.
(email address is attached to all of my commits on GitHub)
I would now like to know the directory structure of the "media folder" (media.zip unpacked)
Inside the zip archive there's a "img" directory with the image files and subdirectories.
In the backup file you provided previously, the value of the key media.media_file is "MiUser/250px-MonaLisaGraffiti.JPG". So, is the "img" dir the parent directory of "MiUser"?
@jrfern The CSV::MalformedCSVError is a bug that needs to be fixed.
Yes, please, @ashmaroli
I would like to take a look at the actual backup file you used to test this branch. You may email the file to me directly instead of exposing it here. (email address is attached to all of my commits on GitHub)
ashmaroli at users.noreply.github.com? Impossible. I'm feeling silly, but I haven't been able to find your email, just your jekyll-talk, github, reddit, linkedin accounts... Mine is jrfern at gmail...
I would now like to know the directory structure of the "media folder" (media.zip unpacked) In the backup file you provided previously, the value of the key media.media_file is "MiUser/250px-MonaLisaGraffiti.JPG". So, is the "img" dir the parent directory of "MiUser"?
I don't understand; the MiUser phrase was a reference to the path. I unzipped the media.zip file, and it created media/img/image_files. Then I ran the command with --mediafolder path/media/img/ (as it never worked, I don't know whether it should simply be --mediafolder path/media/).
Hope this helps. One more thing: for my tests you suggested
gem "jekyll-import", github: "jekyll/jekyll-import", ref: "refs/pull/512/head"
Should I change that now that the PR has been approved?
I'm feeling silly, but I haven't been able to find your email..
Ah! I should have just mentioned it right away instead.. it's ashmaroli at gmail..
run the command with --mediafolder path/media/img/ (as it never worked I don't know if it should be simply --mediafolder path/media/ )...
The original implementation (in existing releases) was to expect just path/media/. The importer would then copy the contents into the destination path assets/images/. For example, say I provide --mediafolder media. Then the importer would look for media/MiUser/250px-MonaLisaGraffiti.JPG and, if found, copy it to assets/images/MiUser/250-px....JPG.
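The lookup-and-copy behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: copy_media is a hypothetical helper, not the importer's actual method, and the default destination assets/images is taken from the description in this comment.

```ruby
require "fileutils"

# Hypothetical helper (not jekyll-import's real code).
# media_root: the directory passed via --mediafolder, e.g. "media"
# media_file: the relative path stored in the export file,
#             e.g. "MiUser/250px-MonaLisaGraffiti.JPG"
# Returns the destination path, or nil when the source file is missing.
def copy_media(media_root, media_file, dest_root = File.join("assets", "images"))
  source = File.join(media_root, media_file)
  return nil unless File.file?(source)

  target = File.join(dest_root, media_file)
  FileUtils.mkdir_p(File.dirname(target)) # keep the "MiUser/" subdirectory
  FileUtils.cp(source, target)
  target
end
```

Under these assumptions, copy_media("media", "MiUser/250px-MonaLisaGraffiti.JPG") would return the copied file's path under assets/images/MiUser/ when the source exists, and nil otherwise.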
The proposed implementation in this branch hasn't actually exposed the --mediafolder option yet. (So, it will always fail if you try.) But it will eventually have similar behavior to maintain backwards-compatibility.
Should I change that now that the PR has been approved?
The reference is permanent. It would be valid even if the pull request branch gets deleted after the pull request is merged. However, since the pull request is still a work-in-progress, you may have to run bundle update jekyll-import to get the latest state of this branch. (You don't have to update until I ask you for feedback.)
@ashmaroli Real backup file sent privately. I'm learning so much - thank you again.
Thanks @jrfern Received the backup file. Will use it to make changes to this branch.
Hello @jrfern
You may update your bundle reference to this branch by running bundle update jekyll-import to test at your end.
I have also updated the importer documentation for better understanding. You may preview the document here.
Recuperated 60 entries into _drafts and their images! Great! I'm fighting at the moment with the paginate-v2 plugin and so can't check but I would say that the import worked.
Thank you again, @ashmaroli
Happy to hear that, @jrfern. Good luck tackling the pagination plugin 🙂 Thank you for testing and giving feedback.
First analysis of the new plugin. I moved the older post ('Informe K-12 Open Minds Conference 2007 - parte I: Europeos') to the posts directory.
It works quite well, though not perfectly.
"Informe K-12 Open Minds Conference 2007 - parte I: Europeos","<blockquote>\r\n<p><em>I was invited to attend the Conference held in Indianapolis. It was the start of something, I have to say. This is part one of my report in Spanish.</em></p>\r\n\r\n<p>La ventaja de dar tiempo a las cosas para ...
...
... y perfilar matices.</p>\r\n</blockquote>\r\n\r\n<p> </p>","<p style=\"text-align: justify;\">Escribo un informe sobre la K-12 Open Minds Conference....
This is converted into
<p style="text-align: justify;">Escribo un informe sobre la K-12 Open Minds Conference. Si eres impaciente puedes leer ya mucha información sobre lo que allí se habló en el <a href="http://k12openminds.wikispaces.com/" hreflang="es">K-12 Open Minds Conference Resource Site</a>.</p>
The blockquote (the whole header) is missing in the import.
In the backup
<p style=\"text-align: justify;\"><a class=\"media-link\" href=\"/dotclear/public/img/dia_1.jpg\"><img alt=\"\" class=\"media\" src=\"/dotclear/public/img/.dia_1_m.jpg\" style=\"float: left; margin: 0 1em 1em 0;\" /></a>
Now it is
<p style="text-align: justify;"><a class="media-link" href="/assets/dotclear/img/dia_1.jpg"><img alt="" class="media" src="/assets/dotclear/img/.dia_1_m.jpg" style="float: left; margin: 0 1em 1em 0;" /></a>
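The before/after pair above suggests that media URLs under /dotclear/public/ end up under /assets/dotclear/. A minimal sketch of such a rewrite follows; rewrite_media_urls is a hypothetical helper, and the two prefixes are taken only from this single example, so the importer's real rule may differ.

```ruby
# Hypothetical helper (an assumption, not jekyll-import's real code).
# Rewrites href/src attributes that point into the old Dotclear public folder.
def rewrite_media_urls(html)
  html.gsub(%r{(href|src)="/dotclear/public/}) { "#{$1}=\"/assets/dotclear/" }
end

before = '<a class="media-link" href="/dotclear/public/img/dia_1.jpg">' \
         '<img alt="" class="media" src="/dotclear/public/img/.dia_1_m.jpg" /></a>'
puts rewrite_media_urls(before)
```

Note that this only rewrites paths; it does not create the dot-prefixed thumbnail files, so those images stay broken unless the files exist in the destination.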
The images are treated as links. That was OK in the sense that there used to be two versions of each image, and the small one is a link to the big one, but there are no names starting with a dot in assets/dotclear and the link should be turned into an <img> tag.
So we miss the introductions to the entries and the images are treated as links. Can any of these points be fixed programmatically?
@jrfern Added support for importing excerpts. While I had seen the post_excerpt field earlier, I did not realise that post_content doesn't start with the excerpt. Jekyll-generated HTML generally has the excerpt as the first paragraph of the contents. (The exception being when the user had supplied a custom excerpt string to Jekyll during the build process.)
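One way to picture the excerpt handling: since post_content does not repeat the excerpt, the two fields have to be joined for the excerpt to appear at the top of the imported post. The sketch below assumes a record is a Hash keyed by the field names mentioned above; merge_excerpt is a hypothetical helper, and the branch may handle this differently.

```ruby
# Hypothetical helper (an assumption, not jekyll-import's real code).
# record: a Hash with "post_excerpt" and "post_content" keys.
def merge_excerpt(record)
  excerpt = record["post_excerpt"].to_s.strip
  content = record["post_content"].to_s
  return content if excerpt.empty?

  # Put the excerpt first so it becomes the leading block of the post body.
  "#{excerpt}\n\n#{content}"
end

record = {
  "post_excerpt" => "<blockquote><p>Intro</p></blockquote>",
  "post_content" => "<p>Body</p>"
}
puts merge_excerpt(record)
```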
ERROR /assets/dotclear/img/.dia_1_m.jpg not found.. but there are no names starting with a dot in assets/dotclear..
These files do not have a separate identity in the media table in the export file. So they won't be imported or mentioned in the log.
the link should be turned into an <img> tag.
They're already valid img tags. You don't see the image, or a placeholder for the missing image, because of CSS.
Great! The excerpt was the only problem with the import, the issue with the images was a problem with the backup, not the import.
From my side the new code works and I have recuperated the posts from this old blog.
@jekyllbot: merge +minor
Re-implement Dotclear importer based on export file provided by @jrfern in https://github.com/jekyll/jekyll-import/issues/510#issuecomment-1453747018.
This drops the dependency on activesupport, includes associated tests and adds the provided export file for future development.
Closes #510