MozillaFoundation / mofo-devops

Mozilla Foundation DevOps Plans, Issues, Discussions

Discuss: Prep story engine for Pulse #612

Closed xmatthewx closed 4 years ago

xmatthewx commented 6 years ago

MoFo has a pile of interviews in StoryEngine that we will put into Pulse profiles before we let go of that site.

I don't think we should attempt an import. But, to expedite the process, can we extract all the content from WP into individual text files and convert HTML to Markdown? We can aim for 80% quality, ignoring edge cases.

A quick export to prep content will enable program staff to simply copy and paste the content, and not fuss with inline style.
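
A rough sketch of what an "80% quality" conversion could look like, using only Python's standard library. The class and function names are illustrative, not an existing tool. It handles headings, paragraphs, bold, italic, links, line breaks and list items (ordered lists come out as plain bullets) and silently drops every other tag:

```python
from html.parser import HTMLParser
import re

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class RoughMarkdown(HTMLParser):
    """Convert a small subset of HTML to Markdown; unknown tags are dropped."""

    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*"}
    BLOCK = {"p", "div", "ul", "ol", "blockquote"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.hrefs = []  # stack of hrefs for currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in HEADINGS:
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "br":
            self.out.append("\n")
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.out.append("[")
        elif tag in self.BLOCK:
            self.out.append("\n\n")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a" and self.hrefs:
            self.out.append("](" + self.hrefs.pop() + ")")
        elif tag in self.BLOCK or tag in HEADINGS:
            self.out.append("\n\n")

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html):
    """Return a Markdown approximation of an HTML fragment."""
    parser = RoughMarkdown()
    parser.feed(html)
    parser.close()
    text = "".join(parser.out)
    text = re.sub(r"[ \t]+\n", "\n", text)  # trim trailing spaces
    text = re.sub(r"\n{3,}", "\n\n", text)  # collapse extra blank lines
    return text.strip() + "\n"
```

Inline styles, tables, images and embeds are the edge cases this ignores; staff would still need to eyeball each file after pasting.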

Thoughts? @cadecairos @gideonthomas @alanmoo @jessevondoom

cadecairos commented 6 years ago

@xmatthewx We transferred ownership of StoryEngine to Loup design a few months ago. Are we planning to scrape the data from their website? I think we might want to ask them first, since they own the rights to that content now.

xmatthewx commented 6 years ago

this plan is their plan. we discussed moving the mozilla storyengine interviews to pulse long ago. we just didn't want to prioritize it, knowing our work on fellow profiles would eventually enable it.

i'm sure they'd give us a sql dump or a wp xml export. i can contact them after we have a sense of whether that's useful for us.
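
If the WP XML export route is taken, pulling posts out of the file might look roughly like the sketch below. It assumes a standard WXR export and the WXR 1.2 namespace URI; older exports declare a different version, so check the `xmlns:wp` attribute at the top of the file. The function name and the returned dict keys are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URIs as declared by a WXR 1.2 export file.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wp": "http://wordpress.org/export/1.2/",
}


def posts_from_wxr(xml_data):
    """Return published posts as dicts of title, slug, HTML body and terms."""
    root = ET.fromstring(xml_data)
    posts = []
    for item in root.iter("item"):
        # Skip pages, attachments, menu items and anything unpublished.
        if item.findtext("wp:post_type", default="", namespaces=NS) != "post":
            continue
        if item.findtext("wp:status", default="", namespaces=NS) != "publish":
            continue
        posts.append({
            "title": item.findtext("title", default=""),
            "slug": item.findtext("wp:post_name", default="", namespaces=NS),
            "html": item.findtext("content:encoded", default="", namespaces=NS),
            # Each <category> carries its taxonomy in the "domain" attribute.
            "terms": [(c.get("domain"), c.text or "")
                      for c in item.findall("category")],
        })
    return posts
```

For a real export, something like `posts_from_wxr(open(path, "rb").read())` should work, and each post's `html` could then be run through whatever HTML-to-Markdown step is chosen and written to a file named after its `slug`.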

cadecairos commented 6 years ago

It doesn't feel like a good use of DevOps time to manually format text data for an import into the CMS. There are plenty of free online tools that can convert HTML to Markdown for us. If we can provide program staff with links to these tools, it won't take much of their time to format the exported content themselves on a case-by-case basis.

xmatthewx commented 6 years ago

Cool. I'll try and take this. I feel like 2 hours from me could save them a dozen hours. And I'm not even sure who "them" is.

Chris, do we have a SQL backup of the site from when we did the handoff? Or should I reach out to Christine P?

cadecairos commented 6 years ago

I believe I did capture a backup. I will have to confirm that.

cadecairos commented 6 years ago

I can confirm there's an encrypted snapshot of the storyengine wordpress site saved in the mofo-archives S3 bucket.

xmatthewx commented 6 years ago

Great. If you can give me a copy of the SQL, I'll see if I can prep the content.

You can strip or skip sensitive info in these tables. Or I can do it:

I only need post content. Terms (tags, cats) could be useful.

More on WP DB tables
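
For reference, a minimal sketch of reading post content plus tags and categories from those tables. It assumes the default `wp_` table prefix and WordPress's stock schema; the query is plain SQL, and it is run through Python's `sqlite3` here only to keep the example self-contained (the real dump would presumably be loaded into MySQL). The function name is illustrative.

```python
import sqlite3

# Assumption: default "wp_" prefix. Posts link to terms through
# wp_term_relationships -> wp_term_taxonomy -> wp_terms.
POSTS_WITH_TERMS = """
SELECT p.ID, p.post_title, p.post_content, tt.taxonomy, t.name
FROM wp_posts AS p
LEFT JOIN wp_term_relationships AS tr ON tr.object_id = p.ID
LEFT JOIN wp_term_taxonomy AS tt ON tt.term_taxonomy_id = tr.term_taxonomy_id
LEFT JOIN wp_terms AS t ON t.term_id = tt.term_id
WHERE p.post_type = 'post' AND p.post_status = 'publish'
ORDER BY p.ID, tt.taxonomy, t.name
"""


def collect_posts(conn):
    """Group the joined rows into one dict per post, with its terms."""
    posts = {}
    for post_id, title, content, taxonomy, term in conn.execute(POSTS_WITH_TERMS):
        post = posts.setdefault(
            post_id, {"title": title, "html": content, "terms": []}
        )
        if term is not None:  # posts without terms still appear, with an empty list
            post["terms"].append((taxonomy, term))
    return list(posts.values())
```

Nothing here touches `wp_users`, `wp_usermeta`, `wp_comments` or `wp_options`, so those tables could be left out of whatever copy gets shared.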

edit: it's so weird to be reading WP documentation. i spent a lot of time here like 9 years ago.

cadecairos commented 4 years ago

This issue seems to have died. Reopen if we need to complete anything here.