Open hellantos opened 2 years ago
ok. i may run an experiment to see how it works.
On Tue, Jan 25, 2022 at 9:46 AM G.A. vd. Hoorn @.***> wrote:
@robinsonmm https://github.com/robinsonmm: could you contact me about exporting old posts from our current host?
Blogs are here, but it is everything.
thanks, this looks like it's exactly what we need.
I thought it looked promising as I scanned it. I didn't go through how the past blog posts are organized/visible, but it seemed to include a lot of the content of the main pages and sub-pages.
On Wed, Jan 26, 2022 at 3:47 AM G.A. vd. Hoorn @.***> wrote:
thanks, this looks like it's exactly what we need.
@robinsonmm: did you find any way to export content in a form other than HTML?
Perhaps SP stores content in a format which doesn't mix formatting with content?
The export you provided seems essentially like the RSS feed, which includes all content, but in its final "rendered" form. I can only include everything verbatim, meaning with all the embedded HTML tags (edit: that's not entirely true, but having it separate would make things easier and probably more consistent).
That's not too nice, and if we could avoid that, it would be great.
I'll check again. This is the direct export from the automated export function. I'll check and see if they have export options under the advanced settings.
On Thu, Jan 27, 2022, 12:20 PM G.A. vd. Hoorn @.***> wrote:
@robinsonmm https://github.com/robinsonmm: did you find any way to export content in a form other than HTML?
Perhaps SP stores content in a format which doesn't mix formatting with content?
The export you provided seems essentially like the RSS feed, which includes all content, but in its final "rendered" form. I can only include everything verbatim, meaning with all the embedded HTML tags.
That's not too nice, and if we could avoid that, it would be great.
In the meantime I've put something together which does a decent job:
Rendered version (original here):
There are some weird things (like duplicated images) and I need to filter out Proofpoint urldefense URLs, but other than that, for a fully automated conversion, it's quite OK I believe.
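For the urldefense filtering, a minimal sketch in Python, assuming the wrapped links use Proofpoint's "v2" scheme (`https://urldefense.proofpoint.com/v2/url?u=...`), where `-XX` stands for `%XX` and `_` stands for `/` in the `u` parameter. The thread doesn't say which wrapper version actually appears in the export, and the function names here are hypothetical, so treat this as an illustration rather than what the converter does:

```python
import re
from urllib.parse import parse_qs, unquote, urlparse

# Matches Proofpoint "v2" wrapped links only; other wrapper versions are not handled.
URLDEFENSE_V2 = re.compile(r"https://urldefense\.proofpoint\.com/v2/url\?[^\s\"'<>)]+")


def unwrap_v2(wrapped: str) -> str:
    """Return the original URL hidden inside a v2 urldefense link."""
    query = parse_qs(urlparse(wrapped).query)
    encoded = query.get("u", [""])[0]
    if not encoded:
        return wrapped  # nothing to decode, leave the link untouched
    # In the v2 scheme "-XX" stands for "%XX" and "_" stands for "/".
    return unquote(encoded.replace("-", "%").replace("_", "/"))


def strip_urldefense(text: str) -> str:
    """Replace every v2 urldefense link in `text` with its decoded target."""
    return URLDEFENSE_V2.sub(lambda m: unwrap_v2(m.group(0)), text)
```

Running `strip_urldefense` over each post body before writing the Markdown file would leave all other links untouched.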
Question to all: how do we want to list authors?
There doesn't appear to be any real consistency in the SP export. It's a mix of names and email addresses (i.e. some posts have names, others addresses).
I can easily map addresses to names (simple search-replace), or we could amend names with addresses, or the other way around (something like John Doe <john.doe@something.somewhere>, I guess that's supported @ipa-cmh).
What would our preference be?
Related: do we want to "port" the tags in some way?
If I understand this correctly it should be possible to keep them.
We have quite a few posts with tags.
Current state of conversion: here.
This is automated, so there are bound to be weird things.
Edit: moved outstanding issues to #18.
Question to all: how do we want to list authors?
There doesn't appear to be any real consistency in the SP export. It's a mix of names and email addresses (i.e. some posts have names, others addresses).
I can easily map addresses to names (simple search-replace), or we could amend names with addresses, or the other way around (something like John Doe <john.doe@something.somewhere>, I guess that's supported @ipa-cmh).
Jekyll is pretty stupid: author is just a text field with no input verification, so yes, it is supported. Not sure what is preferred, but I would be okay with mail addresses mixed with author names. Extracting names from mail addresses will probably have some cases where it does not work...
@gavanderhoorn
Related: do we want to "port" the tags in some way?
If I understand this correctly it should be possible to keep them.
We have quite a few posts with tags.
Yes, you simply have to add tags: [tag1] [tag2] [...] to the YAML front matter of each post. Careful: tags that consist of two words probably need to be added with a single tag: [tag 1] statement.
I'll probably need to do some magic to have them in the website though...
but I would be okay with mail addresses mixed with author names. Extracting names from mail addresses will probably have some cases where it does not work...
no, that's not what I'm suggesting.
There are about 28 post authors, but in some cases, John Doe posted as John Doe, when in other cases, he posted as john.doe@something.somewhere.
It's the same person, just different "identifiers".
What I'm suggesting is to map john.doe@something.somewhere to John Doe. So whenever the converter script encounters john.doe@something.somewhere as the post author, it will just replace it by John Doe.
(I've already implemented this, it's trivial).
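The mapping described above can be sketched roughly as follows. This is not the converter's actual code (which isn't shown in this thread); the dictionary contents and the function name are placeholders:

```python
# Hypothetical alias table; the real one would cover the ~28 authors in the export.
AUTHOR_ALIASES = {
    "john.doe@something.somewhere": "John Doe",
}


def normalise_author(raw: str) -> str:
    """Map a known email address to the author's name, else return the input trimmed."""
    cleaned = raw.strip()
    # Addresses are compared case-insensitively; names pass through unchanged.
    return AUTHOR_ALIASES.get(cleaned.lower(), cleaned)
```

Unknown addresses pass through as-is, so they remain visible in the front matter and can be fixed by hand later.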
Yes, you simply have to add tags: [tag1] [tag2] [...] to the YAML front matter of each post. Careful: tags that consist of two words probably need to be added with a single tag: [tag 1] statement.
already done, see 2013-10-3-nist-ric-americas-member-of-the-week.md for instance.
re: spaces: I've used the all-lowercase-separated-by-hyphens version, not the regular "words with spaces" version (see https://github.com/ipa-cmh/rosindustrial-website/issues/13#issuecomment-1024133108).
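For reference, the all-lowercase-separated-by-hyphens form could be produced along these lines. This is only a sketch with hypothetical helper names, not the converter's actual code; it writes the tags as a YAML flow sequence (`[a, b]`), which Jekyll accepts for `tags`:

```python
import re


def slugify_tag(tag: str) -> str:
    """Lowercase a tag and join its words with hyphens, e.g. 'Member of the Week'."""
    slug = re.sub(r"[^a-z0-9]+", "-", tag.lower())
    return slug.strip("-")


def tags_line(tags) -> str:
    """Build the `tags:` front matter line, dropping empty and duplicate slugs."""
    slugs = []
    for tag in tags:
        slug = slugify_tag(tag)
        if slug and slug not in slugs:
            slugs.append(slug)
    return "tags: [" + ", ".join(slugs) + "]"
```

Because the slugs contain no spaces, the two-word-tag concern raised earlier doesn't come up.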
Another idea: would we want to include a link to the original post on squarespace at the top of the converted one?
Given the risk of lossy-conversion, this might actually be a nice thing to do.
It would also make it easier for us to compare the original against the migrated post.
(we can always remove them later quite easily)
Edit: added this in a2a96f8a50a53484f92544eb82a7f2f7e750aff3.
The 'real' deployment should of course link to a backup (sub)domain.
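A rough idea of what prepending such a link amounts to, assuming Markdown post bodies. The notice wording and the function name are made up for illustration and may differ from what the commit above does:

```python
def with_original_link(body: str, original_url: str) -> str:
    """Prepend a one-line pointer to the original post to a converted Markdown body."""
    # Angle brackets make the bare URL an autolink in Markdown.
    notice = f"*Originally posted at <{original_url}>.*"
    return f"{notice}\n\n{body.lstrip()}"
```

Since the notice is a single, predictable first line, it would be easy to strip again later.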
I just noticed there are a couple of posts included in the export which are marked as draft.
Would we want to migrate those, or exclude them?
@robinsonmm @ipa-cmh @Leesls ?
Exclude drafts.
Also, I have been guilty of entering them for my staff, so I could see where my email address may come in, though the author is someone else. I see that the example with the ROS Additive Mfg has my email address as the title, but that is Victor LaMoine, for instance.
As we'll import everything, I've updated the title of the issue.
Exclude drafts.
Ok.
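Skipping drafts could look something like the sketch below. It assumes the export is a WordPress-style (WXR) XML file in which each `<item>` carries a `<wp:status>` element with the value `draft` for drafts; the thread only says the posts are "marked as draft", so the element name is an assumption, as is the function name:

```python
import xml.etree.ElementTree as ET


def published_items(xml_text: str):
    """Return the <item> elements whose status is anything other than 'draft'."""
    root = ET.fromstring(xml_text)
    kept = []
    for item in root.iter("item"):
        status = None
        for child in item:
            # Compare the local name only, so the exact wp namespace version is irrelevant.
            if child.tag.rsplit("}", 1)[-1] == "status":
                status = (child.text or "").strip().lower()
        if status != "draft":
            kept.append(item)
    return kept
```

Items without any status element are kept, on the assumption that a missing marker means the post was published.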
Also, I have been guilty of entering them for my staff, so I could see where my email address may come in, though the author is someone else. I see that the example with the ROS Additive Mfg has my email address as the title, but that is Victor LaMoine, for instance.
this is not something I can automate.
Someone will have to go through the converted posts and update the front matter.
I'll check again. This is the direct export from the automated export function. I'll check and see if they have export options under the advanced settings.
@robinsonmm: any luck?
it's ok if there's no immediately better export format.
I just wanted to know whether I should spend time improving the conversion, or whether I should wait on what you'd be able to get out of SP.
Per Squarespace, after working on this:
You can export certain content from your Squarespace site into an .xml file. This is useful if you want to export content to WordPress. Not everything will export, as many features rely on our platform’s JavaScript and CSS.
What content will and won't export
Because of how WordPress is designed, it's not possible to import all types of Squarespace content. Our .xml file is set up to export primarily the content that will import to your WordPress site.
What content exports
- Layout pages
- One blog page, including all of its posts and up to 1000 comments per post
- Text blocks
- Image blocks
- Text from other blocks like the embed block, Twitter block, and Instagram block will export with minimum structure
- Gallery pages
- Project pages
What content won't export
- Other page types (including album pages, cover pages, index pages, info pages, events pages, portfolio pages, and store pages)
- Content in page-specific headers, footers, and sidebars
- More than one blog page
- Folders
- Audio blocks
- Product blocks
- Video blocks
- Drafts
- Style changes
- Custom CSS
Note: Some content, such as linked .pdf files, will export in the .xml file but can't import to a WordPress site. Visit their documentation for more details.
Sounds like the export you found is the one they refer to in their email?
Yes. It is geared toward WordPress. It seems all the linked content would have to be recreated. And there are no event pages, so leveraging event pages to add the presentations etc. back in won't work; all of that will be lost. Oh well.
@robinsonmm: could you contact me about exporting old posts from our current host?