jupyter / newsletter

A repo for collecting content for the Jupyter Newsletter
BSD 3-Clause "New" or "Revised" License

Vote on whether to move to Medium as the editing/deployment platform for the newsletter. #27

Closed fperez closed 8 years ago

fperez commented 8 years ago

The question we're asking is: do we move to Medium as the platform for editing/posting the newsletter?

Please vote by using the 👍 /👎 emojis below for tallying. Obviously feel free to post a comment with further info if you'd like.

My personal vote is yes, based on the fact that the team doing most of the work on this found it to be a good solution that solved a number of the workflow problems they were having; it also provides good email integration and stable, archival URLs for the posts. It's not perfect (multi-editor revisioning is pretty primitive/nonexistent, and archival of source material in a cleaner format like markdown doesn't seem easy), but it works for what we need now.

Others may have different opinions, of course.

See #26 for background.

jasongrout commented 8 years ago

I think one of my primary concerns is the ability to get the material out of a proprietary system in the future if the need arises. When you say archiving is not easy...is it doable to get the material out, especially in an automated way (i.e., download all articles)?

willingc commented 8 years ago

Export from Medium is via Settings in their UI. @ruv7 and I did a test this morning, and the ZIP file containing the HTML contents of all articles was delivered in a matter of seconds.

FYI. Observing the CP newsletter folks at work, it does seem to me that Medium is an easier process than Ghost was.
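If it helps to sanity-check an export without clicking through every file, a few lines of Python can unpack the archive and list the posts. This is just a sketch; the folder layout inside Medium's zip is an assumption here, so it globs recursively rather than hard-coding a path:

```python
import zipfile
from pathlib import Path

def extract_posts(archive_path, dest_dir):
    """Unzip a Medium export and return the extracted HTML files, sorted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
    # Glob recursively so the exact folder layout inside the zip doesn't matter.
    return sorted(dest.rglob("*.html"))
```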

jasongrout commented 8 years ago

Cool, thanks for checking. I defer to all y'all's judgement on the matter.

Ruv7 commented 8 years ago

Brian/Katie had previously voiced their yes votes on the transition. @JamiesHQ @Carreau - any feedback or thoughts? If not, please indicate so, and we can move forward.

Carreau commented 8 years ago

Well, I was not able to access the edit interface and never received any invite to edit or view it, so it's hard to judge; I'll let the people involved make that call.

Ruv7 commented 8 years ago

In the flurry of emails on the topic, I can't seem to pinpoint how or when the draft of the first two posts (which are currently live) was sent out. Perhaps it came from @katiewhite360, which is why I can't find it. Regardless of whether I or Katie sent them, I thought you had received them. This flags a problem I see with managing this process with such a large group of contributors. To keep things moving, I think it would be ideal to have one technical person co-lead the effort with me, with others who are interested in the topic serving as backups for when that person is traveling or otherwise unavailable to help. @fperez @ellisonbg - please close the loop with me on this proposed idea; if that doesn't work, let's define who needs to say yes to it so we can make progress.

fperez commented 8 years ago

@willingc, could you post/send that test zip file? I'd like to run a quick test on it out of curiosity...

@Ruv7, could you make @Carreau an editor so he could have a look at the edit interface?

Thanks!

willingc commented 8 years ago

@fperez I shared the google folder with you.

Ruv7 commented 8 years ago

Only publication owners can delete a publication and add editors. @ellisonbg would need to do this. As an editor I can only add writers.

fperez commented 8 years ago

Thanks @willingc! I tested the conversion to markdown from the folder you gave me, using pandoc, and it mostly works pretty well. The following three lines of IPython do the trick:

files = !ls *.html.docx
for f in files:
    !pandoc -f docx -t markdown --atx-headers $f > {f.replace('html.docx', 'md')}

I've pasted the result here.

The original, for comparison, is here

A few notes on the result:

In summary, I think it's doable, but it will require a little bit of elbow grease.

My suggestion:

@Ruv7, how does that sound? I'm happy to walk you through this process so that you and your team can do it; my estimate is that, once you're familiar with it, it will add at most a few minutes per newsletter, which is peanuts compared to the actual work of writing it up.

That would give us a pretty solid archival story for the long run, I think.

Is this acceptable to folks? BTW @jasongrout, thanks for pressing on this!!

ellisonbg commented 8 years ago

The publication owner is @ProjectJupyter and the authentication is the regular Twitter account. Matthias can add himself as an editor by logging onto Medium using this account.


Brian E. Granger Associate Professor of Physics and Data Science Cal Poly State University, San Luis Obispo @ellisonbg on Twitter and GitHub bgranger@calpoly.edu and ellisonbg@gmail.com

ellisonbg commented 8 years ago

What is the goal of having a separate archive of the posts that is updated upon each newsletter being published?

I agree that it is good to be able to get all of our data if/when we need it, but I don't see why we have to do that whole process every two weeks when we are not going to use the data in any useful way in the short term.

The danger of Medium disappearing without notice is extremely minimal - far lower than us losing all the data from our Ghost blog because our deployment fails. That actually almost happened a while back. The backup of the Ghost blog even failed, but thankfully we had a second backup that did work.

I am fine if someone technical wants to volunteer to do this work regularly, but that should be entirely decoupled from the newsletter team's regular work.

fperez commented 8 years ago

I don't like the idea of "do your backups in an emergency". It's kind of saying "buy a fire extinguisher when your house is on fire".

I think if we want the data to be backed up, it should be done when created. The process I outlined is literally minutes of work (for a machine with python and pandoc). I don't think it's too high a burden to keep our data available in a clean format for future record.
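For concreteness, the convert-on-publish step could be scripted once and rerun after each issue. A minimal sketch, assuming pandoc is on the PATH and the export is raw HTML (the function names and folder layout here are mine, not an agreed workflow):

```python
import shutil
import subprocess
from pathlib import Path

def pandoc_command(post_file):
    """Build the pandoc invocation converting one exported post to markdown."""
    src = Path(post_file)
    return ["pandoc", "-f", "html", "-t", "markdown", "--atx-headers",
            str(src), "-o", str(src.with_suffix(".md"))]

def archive_export(export_dir):
    """Convert every HTML post in a Medium export folder to markdown."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    for post in sorted(Path(export_dir).glob("*.html")):
        subprocess.run(pandoc_command(post), check=True)
```

Separating command construction from execution keeps the "minutes of work" claim honest: the script can be eyeballed and rerun per newsletter without any manual steps.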

Furthermore, it's a lot easier to spot little problems and fix them once a month for a couple of newsletters, than to go back and fix a ton of them retroactively. That would indeed be a lot of work.

I simply think that good data hygiene implies keeping the backed-up info in a good state as we go along. In the grand scheme of things, I think it's a very minimal burden for the upside of having all our newsletters cleanly archived for the long haul.

ellisonbg commented 8 years ago

We have way more important stuff that isn't backed up, but as long as someone technical wants to do this regularly, I don't mind.


ellisonbg commented 8 years ago

Also, if no one technical steps up to the plate, Ana or Katie could just download the .zip file and throw it into a Dropbox folder with a name that reflects the date. Not pretty, but it would at least cover the data-loss risk.
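That fallback could even be wrapped in a tiny script so the date naming is automatic. A sketch of the dated-copy idea (file and folder names are illustrative; the Dropbox part is just wherever the folder lives):

```python
import shutil
from datetime import date
from pathlib import Path

def stash_export(zip_path, backup_dir):
    """Copy a Medium export zip into a backup folder under a dated name,
    e.g. newsletter-export-2016-05-06.zip, so successive downloads
    don't overwrite each other."""
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"newsletter-export-{date.today().isoformat()}.zip"
    shutil.copy2(zip_path, dest)
    return dest
```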


fperez commented 8 years ago

Is there a way for anyone else to download that data? I can't seem to... It would be good to ensure that folks other than Ana and Katie can actually grab the raw data.

fperez commented 8 years ago

ps - what else is critical that we aren't backing up? If that's the case, we should make a list and a plan to address it...

fperez commented 8 years ago

Actually, scratch that: the docx format seems to be an artifact of Google Drive. @willingc, could you email me the zip file directly as generated by Medium itself? It would be best to test how much extra cleanup is really necessary straight from their archive; Google Drive seems to be doing an automatic export to Word, as best I can tell.

willingc commented 8 years ago

@Ruv7 Would you mind emailing @fperez the original zip file that we downloaded from Medium yesterday? I only have the unzipped contents in Google Docs. I'll be in the office in a bit if we need to download again 💃

I'll also take a look at the Medium API to see if there is anything in there. My guess is likely not, but I'll check.

ellisonbg commented 8 years ago

Not backed up:


Ruv7 commented 8 years ago

Yes, I can send the unzipped file. Is this detail going to stop us from using Medium? If not, does it make sense to concurrently create the custom subdomain, so that we're ready to go once we work through this? I'm wrapping up the design with Cameron today.

rgbkrk commented 8 years ago

Lately I've been viewing Hackpad, Slack, and Dropbox as ephemeral, just-in-time thinking and design work. If something is meant to carry on longer, it must end up in a repository somewhere, whether as code, docs, or issues.

As for assets, while not completely bulletproof, git+lfs has been pretty good so long as you set up the repo right.

I'm definitely not going to volunteer for setting up a job to slurp down our medium posts. I am quite happy with using Medium though, it sounds like a great and simple direction for the project.

ellisonbg commented 8 years ago

Pretty much all non-technical parts of the project are on Dropbox/Google/Hackpad. This includes staffing, budgets, financial, organizational, fund-raising, design, communications. Many/most of the people who work with these documents don't use GitHub at all.

jasongrout commented 8 years ago

The danger of Medium disappearing without notice is extremely minimal - far lower than us losing all the data from our Ghost blog because our deployment fails. That actually almost happened a while back. The backup of the Ghost blog even failed, but thankfully we had a second backup that did work.

FYI, the purpose of my original query had more to do with how we get our content out if we decide in the future to migrate off of medium, not necessarily how to back up our content for when medium catastrophically fails. It's useful to think about backups of important content too, but that's also part of a much larger conversation about all of our "critical" content, and I don't think that larger conversation should necessarily impede progress here.

ellisonbg commented 8 years ago

Thanks for the clarification - from that perspective the HTML export that Medium has should be sufficient and we can do that if/when we decide to migrate elsewhere.


Ruv7 commented 8 years ago

Next steps to closing out this issue:

@cameronoelsen - can you point me towards the images you created? I'll upload them and make sure they're working well.

@katiewhite360 - please let me know when you have the final draft for the post. Link to today's DevMeeting: https://youtu.be/5FilnwzSicU

@minrk - thanks for volunteering to create the custom subdomain. What info do you need to create this?

willingc commented 8 years ago

@Ruv7 To help with the time difference, @minrk probably would need the name you want as the custom subdomain, e.g. newsletter.jupyter.org or puppiesandkittens.jupyter.org, and possibly an email address associated with the newsletter.
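For reference, pointing such a subdomain at Medium typically comes down to a single DNS record plus confirming the domain in Medium's settings. A purely hypothetical zone-file sketch (the target host below is illustrative; the actual value comes from Medium's custom-domain instructions):

```text
; hypothetical zone entry -- use the target Medium's docs specify
newsletter.jupyter.org.  3600  IN  CNAME  hosting.medium.example.
```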

cameronoelsen commented 8 years ago

Here is the header: infographic_newsletter3

I can make one of these when we know the day we will be publishing: title_bar

Carreau commented 8 years ago

puppiesandkittens.jupyter.org

I vote for this one.

Carreau commented 8 years ago

and possibly an email associated with the newsletter.

That already exists and is noreply@jupyter.org, which redirects to projectjupyter+noreply@gmail.com and hence is autotagged by the projectJupyter Gmail account.

Carreau commented 8 years ago

@minrk - thanks for volunteering to create the custom sub domain, what info do you need to create this?

newsletter.jupyter.org also already exists and points to the newsletter on Medium.

Ruv7 commented 8 years ago

You beat me to this - I was going to check what the address was, and I see that it is already done. @ellisonbg please confirm if anything else needs to happen. Our Medium publication can now be found at https://newsletter.jupyter.org/

ellisonbg commented 8 years ago

Great I think we are ready to go with the next one on medium - thanks everyone!


minrk commented 8 years ago

newsletter.jupyter.org should be all set up. I can point something else to it as well, if that was wrong.

fperez commented 8 years ago

Sorry, I replied to Ana via email separately. Yes, I don't think we need to make this decision contingent on the archiving: what we get from Medium in HTML isn't ideal, but it's workable.

I do think that our publicly facing "persistent" content, like blog and newsletter raw sources, should always be stored in a repo somewhere, along the lines @rgbkrk describes above, as a matter of principle. It's viable to do at least the 80% version of that with minimal effort.