apache / openwhisk

Apache OpenWhisk is an open source serverless cloud platform
https://openwhisk.apache.org/
Apache License 2.0

archive slack history to new apache list #4272

Closed rabbah closed 5 years ago

rabbah commented 5 years ago

Our slack history is currently not archived on Apache servers. This issue is to figure out the best way to do that.

Exporting the history from Slack yields folders for each channel and files for each day of communication for that channel. At the very least we can serialize the export and send an email per day per channel to a specific Apache mailing list.

subject: [channel name] [day] where the content is: [timestamp] [user id|name] message
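A minimal sketch of the serialization described above, assuming the export has one folder per channel with one `YYYY-MM-DD.json` file per day, and that each file is a JSON array of messages carrying `ts`, `user` and `text` fields. The function names and the optional `user_name` field are illustrative assumptions, not part of any existing tool:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def render_digest(channel, day, messages):
    """Return (subject, body) for one channel on one day."""
    subject = f"[{channel}] [{day}]"
    lines = []
    for msg in sorted(messages, key=lambda m: float(m["ts"])):
        stamp = datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc)
        # Fall back to the raw user id when no display name is present.
        who = msg.get("user_name") or msg.get("user", "unknown")
        lines.append(f"[{stamp:%H:%M:%S}] [{who}] {msg.get('text', '')}")
    return subject, "\n".join(lines)


def iter_digests(export_root):
    """Yield (subject, body) for every <channel>/<day>.json file in the export."""
    for day_file in sorted(Path(export_root).glob("*/*.json")):
        messages = json.loads(day_file.read_text(encoding="utf-8"))
        yield render_digest(day_file.parent.name, day_file.stem, messages)
```

Each yielded pair could then become one email to the archive list. Timestamps are rendered in UTC here so the output does not depend on where the job runs.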

bdelacretaz commented 5 years ago

Thanks for this initiative!

I think the main goal is for those messages to be findable at https://lists.apache.org/, but maybe some channels deserve to be exposed directly to the OpenWhisk dev and users lists? We could probably create a dedicated slack-archive list for the other channels.

If that's possible it would be nice to include a link that points to the position of the first message in the Slack channel. That might cause a forking of some discussions between Slack and lists but that's inevitable.

rabbah commented 5 years ago

> If that's possible it would be nice to include a link that points to the position of the first message in the Slack channel.

Setting aside the question of whether I can construct the link from the information available - since the openwhisk slack is not a paid slack, the messages that are older than the most recent 10K will not be available publicly and so the link will not work as expected.

I could include the link anyway for posterity if it's feasible.

bdelacretaz commented 5 years ago

> I could include the link anyway for posterity if it's feasible.

If that's easy I would include it and you can add a note about the 10k limit - that would at least allow people to join recent conversations there.

rabbah commented 5 years ago

Thanks for the feedback @bdelacretaz.

Some further refinement based on input from @csantanapr and @dgrove-oss from Slack:

rabbah commented 5 years ago

Some mining of previous discussion on the dev list

The daily Slack digests from Apache Pulsar are perhaps a model to follow: https://lists.apache.org/list.html?dev@pulsar.apache.org

Sample digest https://lists.apache.org/thread.html/d4889a10a74c73e925b83bab0fcfa5155bb488be4b1bd7dee16b1646@%3Cdev.pulsar.apache.org%3E

It uses the Python Slacker package; the API call is here: https://github.com/merlimat/slack-email-digest/blob/master/slack_email_digest.py#L55

chetanmeh commented 5 years ago

The primary api is documented at https://api.slack.com/methods/channels.history

chetanmeh commented 5 years ago

There is an official npm package documented at https://slackapi.github.io/node-slack-sdk/web_api

rabbah commented 5 years ago

I adapted the slack to email digest noted above and ran the first digest moments ago. Here's the first post to the dev list https://lists.apache.org/thread.html/d7833101aa1dcc48a36e1916895e0afc4a4aed348d33535c79111beb@%3Cdev.openwhisk.apache.org%3E

I'm currently running the job out of my github repo but will move it to an apache repo once we figure out where it should sit.

rabbah commented 5 years ago

@bdelacretaz no dice on the link - not enough info to construct it. I could put a general link to the slack channel but not individual messages. I'll read more of the API for something better.

rabbah commented 5 years ago

Heh the email subject looks like it's from the future (ran in a different time zone)... 😅 Will tweak the app.
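One way to avoid a "future" date in the subject is to compute the digest window in UTC instead of the runner's local time. This is only a sketch of that idea, not necessarily how the app was tweaked; `previous_utc_day` is a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone


def previous_utc_day(now=None):
    """Return (start, end) datetimes bounding the last complete UTC day."""
    now = now or datetime.now(timezone.utc)
    # Normalize to UTC first so the runner's local zone cannot shift the date.
    end = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return end - timedelta(days=1), end
```

The subject date would then come from `start.date()`, and `start`/`end` can double as the oldest/latest bounds when fetching the channel history.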

rabbah commented 5 years ago

I tweaked the digest to ignore channel join/leave events - this cuts down noise from channels that may only have such events on a given day.

If the volume of messages becomes high, we can look to consolidate the digests across channels. For now I'm leaving them separate since we primarily use 1-3 channels.
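Ignoring join/leave events, as described above, could look roughly like this. It assumes each message is a dict and that membership events carry Slack's `channel_join` or `channel_leave` subtype; the helper names are hypothetical:

```python
# Message subtypes treated as noise (assumed to be the join/leave markers).
NOISE_SUBTYPES = {"channel_join", "channel_leave"}


def drop_membership_events(messages):
    """Keep only messages that are not channel join/leave events."""
    return [m for m in messages if m.get("subtype") not in NOISE_SUBTYPES]


def should_send_digest(messages):
    """Skip the email when nothing but join/leave events happened that day."""
    return bool(drop_membership_events(messages))
```

Ordinary messages have no `subtype` key, so `m.get("subtype")` returns `None` and they pass through unchanged.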

bdelacretaz commented 5 years ago

> I could put a general link to the slack channel but not individual messages

I think that's good enough, people can search for the messages if needed

Apart from that I think saying [slack-digest] in square brackets in the subject lines would make filtering easier. But the digests look good already, thank you!
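For illustration, a tagged subject line could be built along these lines with Python's standard `email` package. The exact subject layout and the addresses are assumptions, and actually sending the message (for example via `smtplib`) is left out:

```python
from email.message import EmailMessage


def build_digest_email(channel, day, body, sender, recipient):
    """Build one digest email whose subject starts with a filterable tag."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # The leading [slack-digest] tag lets subscribers filter these mails.
    msg["Subject"] = f"[slack-digest] [{channel}] {day}"
    msg.set_content(body)
    return msg
```

A mail filter can then match on the literal `[slack-digest]` prefix regardless of channel or date.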

rabbah commented 5 years ago

I will adjust the subject line as suggested.

More reading of the Slack API - there is a way to retrieve a permanent link for each message documented here https://api.slack.com/methods/chat.getPermalink

update: now the digest will include the permalinks.
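A rough sketch of fetching a permalink with `chat.getPermalink`, using only the standard library. It assumes the token is passed as a bearer header and that the response is JSON with `ok` and `permalink` fields; the function names are illustrative and this is not the digest's actual code:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://slack.com/api/chat.getPermalink"


def build_permalink_request(token, channel_id, message_ts):
    """Prepare the GET request for one message's permalink."""
    query = urlencode({"channel": channel_id, "message_ts": message_ts})
    return Request(
        f"{API_URL}?{query}", headers={"Authorization": f"Bearer {token}"}
    )


def parse_permalink(payload):
    """Extract the permalink from a JSON response body, or raise on error."""
    data = json.loads(payload)
    if not data.get("ok"):
        raise RuntimeError(f"chat.getPermalink failed: {data.get('error')}")
    return data["permalink"]


def get_permalink(token, channel_id, message_ts):
    """Perform the network call; needs a valid token and channel id."""
    request = build_permalink_request(token, channel_id, message_ts)
    with urlopen(request) as resp:
        return parse_permalink(resp.read().decode("utf-8"))
```

Since this is one call per message, a busy channel may need some pacing to stay within Slack's rate limits.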

rabbah commented 5 years ago

The code lives here and runs nightly on Travis CI: https://github.com/rabbah/slack-email-digest

Considering this task completed for now.

chetanmeh commented 5 years ago

Thanks @rabbah !! Later we can probably add this code to the devtools repo and execute it only if it's triggered by cron

rabbah commented 5 years ago

👍 I wanted to get the upstream repo to add an (Apache :) license first before committing to our code base.

https://github.com/merlimat/slack-email-digest/issues/1