DJF3 / Webex-Message-space-archiver

Archive Cisco Webex Teams Space messages to a single HTML file. Highly configurable (download files/images, sort order, max messages or age, avatars, etc)

Webex Space Archiver

Because of a 'refocus', I will no longer maintain this repository. Many thanks for the kind words received from users and for the feedback they have provided over the past years!

Important: Release v30! (check out the new features in the release notes)

Features

Start

Configure

Release notes

Troubleshooting

Feedback & Support


Archive Cisco Webex space messages to a single HTML file. NOTE: This code is written for a customer as an example. I specifically wanted 1 (one) .py file that did everything. It's not beautiful code but it works :-) Feedback? Please go here and let me know what you think!

VIDEO

How to use & Demo

SCREENSHOT

Example HTML file of an archived Webex space:

                    

REQUIREMENTS

Features

  • Archives all messages in a space
  • Find space ID with built-in search function
  • Batch archiving with multiple config files & commandline parameters NEW
  • Deal with threaded messages
  • Support for automatic and manual DST configuration ('summertime') NEW
  • Download images, files or both (with msg file date)
  • All files are organized: \spacenamefolder with subfolders for \files, \images, \avatars
  • Export space data to JSON and/or TXT file
  • Restrict messages by number of messages, number of days, from-date or from-to date
  • Display: messages grouped per month, with navigation at the top
  • Display: show full user names
  • Display: show (linked or downloaded) user avatars
  • Display: attached file-names + size
  • Display: "@mentions" in a different color
  • Display: quoted or formatted text
  • Display: external users in different color (users with other domain)
  • Display: images in popup when clicked
  • Support for blurring email addresses and names NEW
  • Print: just like it appears on the screen

It doesn't:

  • Clean your dishes
  • Download whiteboards (unless you post a snapshot)
  • Download/display files shared in external Enterprise Content Management systems (Onedrive/Sharepoint)
  • Display reactions to messages (not accessible via API)
  • Mow your neighbour's lawn (I've tried)
  • Render cards

NOTE:

  • Message times are stored in UTC. The time zone on your device determines how this UTC time/date is displayed. A message sent at 12:43 CEST is stored as 10:43 UTC. When you change your time zone to PDT (UTC-7), it is displayed as 03:43.
  • When printing the generated HTML file in Firefox: File, Print, check "print background colors and images", then print or save to PDF
  • To store your Webex token in an environment variable:
    • Windows: set WEBEX_ARCHIVE_TOKEN=YOUR_TOKEN_HERE
    • Mac: export WEBEX_ARCHIVE_TOKEN='YOUR_TOKEN_HERE'
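
The time zone note above can be checked with a few lines of Python. This is only an illustrative sketch: it uses fixed offsets for CEST (UTC+2) and PDT (UTC-7) and an arbitrary summer date, and it is not taken from the archiver script itself.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets keep the example independent of the system time zone database.
CEST = timezone(timedelta(hours=2), "CEST")
PDT = timezone(timedelta(hours=-7), "PDT")

# A message sent at 12:43 local time in CEST (the date is an arbitrary example).
sent_local = datetime(2023, 7, 1, 12, 43, tzinfo=CEST)

# The same moment expressed in UTC, which is how the message time is stored.
stored_utc = sent_local.astimezone(timezone.utc)

# The same moment as a device set to PDT would display it.
shown_pdt = stored_utc.astimezone(PDT)

print("sent  :", sent_local.strftime("%H:%M %Z"))
print("stored:", stored_utc.strftime("%H:%M %Z"))
print("shown :", shown_pdt.strftime("%H:%M %Z"))
```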

Start

  1. Meet the requirements

  2. Run the script (python webex-space-archive.py) to create the configuration file "webexspacearchive-config.ini" (if it does not exist)

  3. In the webexspacearchive-config.ini file, save your developer token or (👍better!) create an environment variable called "WEBEX_ARCHIVE_TOKEN" with your token

  4. Run the script: python webex-space-archive.py

Parameter              Description                                                      Example
(nothing)              use standard configuration .ini file
CONFIG_FILE            use non-standard configuration .ini file                         testspace.ini
SEARCH_STRING          search for space name to get the space ID                        ciscolive
SPACE_ID               use this SPACE_ID with standard configuration .ini file          Y2lzY29zcGFyazovL3VzL0lfS05FVy95b3Vfd291bGRfdHJ5X2hhaGE
CONFIG_FILE SPACE_ID   use non-standard configuration .ini file and provided SPACE_ID   a combination of examples above
SPACE_ID CONFIG_FILE   use non-standard configuration .ini file and provided SPACE_ID
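
For batch archiving with several configuration files, one option is a small wrapper that calls the script once per .ini file. The sketch below is an assumption about how you might drive the script from the outside, not part of the archiver: `build_commands` and `run_batch` are hypothetical helper names, and `other.ini` in the usage note is a made-up file name.

```python
import subprocess
import sys

SCRIPT = "webex-space-archive.py"  # script name as used in the steps above


def build_commands(config_files, space_id=None):
    """Return one command line per config file (hypothetical helper).

    If space_id is given, it is appended after the config file, matching
    the "CONFIG_FILE SPACE_ID" parameter combination.
    """
    commands = []
    for config_file in config_files:
        command = [sys.executable, SCRIPT, str(config_file)]
        if space_id:
            command.append(space_id)
        commands.append(command)
    return commands


def run_batch(commands):
    """Run the commands one after another; stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Usage: `run_batch(build_commands(["testspace.ini", "other.ini"]))` archives one space per configuration file, in order.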

UPGRADE? Replace the .py file and keep your configuration file (.ini). To pick up changes to the .ini file, run the script once without an existing .ini file; it will create a new one with the latest remarks and features.

Configuration

Edit the following settings in the webexspacearchive-config.ini file:


Personal Token: you can find this on developer.webex.com. Log in (top right of the page) and scroll down to "Your Personal Access Token". See the 'NOTE' section above for how to store your token in an environment variable instead.

mytoken = "YOUR_TOKEN_HERE"

NOTE: This token is valid for 12 hours! Then you have to get a new Personal Access Token.
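
A minimal sketch of how a token could be picked up from either place. The environment variable name WEBEX_ARCHIVE_TOKEN comes from the notes above; the function name `resolve_token` and the rule that the environment variable wins over the .ini value are assumptions made for illustration, not a description of what the script does internally.

```python
import os


def resolve_token(config_token, environ=os.environ):
    """Return the token to use (hypothetical helper).

    Assumption: a non-empty WEBEX_ARCHIVE_TOKEN environment variable
    takes precedence over the value stored in the .ini file.
    """
    env_token = environ.get("WEBEX_ARCHIVE_TOKEN", "").strip()
    if env_token:
        return env_token
    return (config_token or "").strip()
```

Keeping the token in an environment variable means it never ends up in a configuration file that you might share or back up.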


Space ID: To find this, first save your developer token in the .ini file. Then run the script with a search argument as a parameter. It will list all spaces and space IDs that match your search argument. Alternatively: go to Webex Developer List rooms, make sure you're logged in, set the 'max' parameter to '900' and click Run. If you don't see the RUN button, make sure 'test mode' is turned on (top of page, under "Documentation"). TIP: to get the space ID of a space that you are in, go to help / copy space details in the client. Then in Webex talk to the bot "spaceidbot@webex.bot" and paste the space details. In return you get the space ID to be used here.

myspaceid = "YOUR_SPACE_ID_HERE"
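
The "List rooms" lookup can also be done from Python. The sketch below calls the Webex rooms endpoint with a bearer token and filters the result by title. The function names `fetch_rooms` and `find_spaces` are hypothetical, and the case-insensitive substring match is an assumption; the script's own search may behave differently.

```python
import json
import urllib.parse
import urllib.request

ROOMS_URL = "https://webexapis.com/v1/rooms"


def fetch_rooms(token, max_rooms=900):
    """Fetch up to max_rooms spaces visible to the token owner."""
    query = urllib.parse.urlencode({"max": max_rooms})
    request = urllib.request.Request(
        f"{ROOMS_URL}?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["items"]


def find_spaces(rooms, search):
    """Return (title, id) pairs whose title contains the search string.

    Assumption: matching is a case-insensitive substring test.
    """
    needle = search.lower()
    return [
        (room.get("title", ""), room["id"])
        for room in rooms
        if needle in room.get("title", "").lower()
    ]
```

Usage: `find_spaces(fetch_rooms(token), "ciscolive")` prints nothing by itself but returns the matching titles with their space IDs, which you can then copy into myspaceid.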


Downloadfiles: do you want to download images or images & files? Think about it. Downloading images and files can significantly increase the archive time and consume disk space. Downloaded images or files are stored in the subfolder. Options:

downloadfiles = info


UserAvatar: Do you want to show the user avatar or an icon? Avatars are not downloaded but linked. That means the script will get the user Avatar URL and use that in the HTML file. So the images are not downloaded to your hard-drive. Needs an internet connection in order to display the Avatar images.

useravatar = link


Max Messages: Restrict the number of messages that are archived. Some spaces contain 100,000 messages and you may not want to archive all of them. To archive the last 5000 messages:

maxtotalmessages = 5000
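
To illustrate what restricting by number of messages or by age means, here is a small sketch. It assumes each message is a dict with a "created" timestamp in the ISO format used by the Webex API (for example "2023-03-19T10:43:00.000Z"). The names `restrict_messages`, `max_total` and `max_days` are made up for this example and are not the script's internals.

```python
from datetime import datetime, timedelta, timezone


def parse_created(value):
    """Parse an ISO timestamp such as '2023-03-19T10:43:00.000Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def restrict_messages(messages, max_total=0, max_days=0, now=None):
    """Keep only the newest messages (hypothetical helper).

    max_total: keep at most this many messages (0 = no limit).
    max_days:  keep only messages from the last N days (0 = no limit).
    The result is ordered newest first.
    """
    now = now or datetime.now(timezone.utc)
    newest_first = sorted(
        messages, key=lambda m: parse_created(m["created"]), reverse=True
    )
    if max_days:
        cutoff = now - timedelta(days=max_days)
        newest_first = [
            m for m in newest_first if parse_created(m["created"]) >= cutoff
        ]
    if max_total:
        newest_first = newest_first[:max_total]
    return newest_first
```

With max_total=5000 this keeps the last 5000 messages, in the same spirit as the maxtotalmessages setting.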


OutputFilename: Enter the file name of the output HTML file. If EMPTY the filename will be the same as the Archived Space name (recommended).

outputfilename = yourfilename.html


Sorting: the sort order of the archived messages.

sortoldnew = yes


OutputJSON: Besides the .html file, how would you like to store your messages?

outputjson = no


DST: Configure when daylight saving time ('summertime') starts and stops. Both EU and US examples are shown in the .ini file.

dst_start = L,7,3 (last Sunday of March)

dst_stop = L,7,10 (last Sunday of October)
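
The rule format can be read as "occurrence, weekday, month". The sketch below turns such a rule into a concrete date for a given year. It assumes that "L" means the last occurrence in the month, that 7 means Sunday (ISO weekday numbering), and that a number in the first field means the n-th occurrence; only the "L,7,3" and "L,7,10" forms appear above, so the numeric form is a guess. `dst_rule_to_date` is a hypothetical helper name.

```python
import calendar
from datetime import date


def dst_rule_to_date(rule, year):
    """Convert a rule like 'L,7,3' to a date in the given year.

    Assumed format: occurrence ('L' = last, or 1..5), ISO weekday
    (1 = Monday ... 7 = Sunday), month (1..12).
    """
    occurrence, weekday, month = [part.strip() for part in rule.split(",")]
    iso_weekday = int(weekday)
    month_number = int(month)
    days_in_month = calendar.monthrange(year, month_number)[1]

    # All dates in the month that fall on the requested weekday.
    matching = [
        date(year, month_number, day)
        for day in range(1, days_in_month + 1)
        if date(year, month_number, day).isoweekday() == iso_weekday
    ]
    if occurrence.upper() == "L":
        return matching[-1]
    return matching[int(occurrence) - 1]


print(dst_rule_to_date("L,7,3", 2023))   # last Sunday of March 2023
print(dst_rule_to_date("L,7,10", 2023))  # last Sunday of October 2023
```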


Blurring: Blur names and email addresses in html file

blurring = yes

  • empty :(default) no blurring
  • "yes" : Note that it is a VISUAL blur. Data can still be copy/pasted

Troubleshooting

Most errors should be handled by the script.

Release Notes

For old release notes click here

Enhancements in release v30 - March 19th 2023

Overall: increased output quality and precision. Support for DST, privacy blurring, bulk processing

Important Enhancements - all based on user requests

IMPROVEMENTS

FIXED

NOTE

Info

Feedback & Support

Submit here, open an issue or if you know my email address: send a message on Webex (not via email!).