rusq / slackdump

Save or export your private and public Slack messages, threads, files, and users locally without admin privileges.
GNU General Public License v3.0

attachments/images preview/thumbnails download #111

Open csaket opened 2 years ago

csaket commented 2 years ago

Is your feature request related to a problem? Please describe.
Thumbnails for images in channels etc. are not downloaded; the export still refers to the original thumbnails on Slack. As a result, when viewing the messages in the group, the thumbnails are broken and do not display because of CORB and MIME type issues. Additionally, I would expect an archival tool to pull down all such assets, or at least to offer an option to do so. My company is moving away from Slack pretty soon, so we won't have access to the original images etc.

Describe the solution you'd like
Download thumbnails as well, just like the actual images/attachments.

rusq commented 2 years ago

Hey @csaket, it is possible to pull the thumbnails along with files. Biggest question is - where to pull those thumbnails to?

Could you tell me more about "thumbnails are broken and do not display":

  1. Which program are you using to preview the export?
  2. Do you know which directory the program expects to find those thumbnails in?

csaket commented 2 years ago

Hi.

I am using slack-export-viewer to view the export. The thumbnails are just URLs that the JSON shows as pointing to files.slack.com/... Unfortunately, even the downloaded images are not shown: the link is incorrect, and even correcting it does not work.

Here is a fragment of the message JSON with annotations:

   ...
    "latest_reply": "1630598804.011500",
    "files": [
      {
        "id": "XXXXXXXXXXX",
       ...
        "name": "image.png",
        "title": "image.png",
        "mimetype": "image/png",
        ...
        "mode": "hosted",
        "editable": false,
        "is_external": false,
        "external_type": "",
        "size": 80408,
        "url": "",
        "url_download": "",
        "url_private": "attachments/F02XXXXXXXX-image.png", <- Does not display when clicked, because the URL ends up being http://localhost:5000/attachments/F02XXXXXXXX-image.png, while the file is actually located in /channelname/attachments/F02XXXXXXXX-image.png
        "url_private_download": "attachments/F02XXXXXXXX-image.png",
        "original_h": 758,
        "original_w": 718,
        "thumb_64": "https://files.slack.com/files-tmb/T020XXXXXXX-F02XXXXXXXX-280XXXXXXX/image_64.png", <- These are the thumbnails that are not downloaded
        "thumb_80": "https://files.slack.com/files-tmb/T020XXXXXXX-F02XXXXXXXX-280XXXXXXX/image_80.png", <- These are the thumbnails that are not downloaded
        "thumb_160": "https://files.slack.com/files-tmb/T020XXXXXXX-F02XXXXXXXX-280XXXXXXX/image_160.png", <- These are the thumbnails that are not downloaded
        "thumb_360": "https://files.slack.com/files-tmb/T020XXXXXXX-F02XXXXXXXX-280XXXXXXX/image_360.png", <- These are the thumbnails that are not downloaded
        "thumb_360_gif": "",
        "thumb_360_w": 341,
        "thumb_360_h": 360,
        "thumb_480": "https://files.slack.com/files-tmb/T020XXXXXXX-F02XXXXXXXX-280XXXXXXX/image_480.png", <- These are the thumbnails that are not downloaded
        "thumb_480_w": 455,
        "thumb_480_h": 480,
        "thumb_720": "https://files.slack.com/files-tmb/T020XXXXXXX-F02XXXXXXXX-280XXXXXXX/image_720.png", <- These are the thumbnails that are not downloaded
        "thumb_720_w": 682,
        "thumb_720_h": 720,
        ...
        "permalink": "https://xxxxxx.enterprise.slack.com/files/U028XXXXXXX/F02XXXXXXXX/image.png",
        "permalink_public": "https://slack-files.com/T020XXXXXXX-F02XXXXXXXX-1a53XXXXXX",
...

The browser blocks these thumbnails from displaying with a CORB error:

Cross-Origin Read Blocking (CORB) blocked cross-origin response https://xxxxxx.enterprise.slack.com/?redir=%2Ffiles-tmb%2FT020XXXXXXX-F03HXXXXXXXX-781033a93b%2Fimage_360.png with MIME type text/html. See https://www.chromestatus.com/feature/5629709824032768 for more details
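For illustration, a minimal Go sketch of how the thumbnail links in a file object like the one above could be collected for download. This is not slackdump's code: the function name `thumbURLs` is hypothetical, and the sample data is a placeholder modelled on the fragment, not real URLs. It assumes that every key starting with `thumb_` and holding a non-empty string is a thumbnail link, which matches the fragment (the `_w` and `_h` fields are numbers, and `thumb_360_gif` is empty).

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// thumbURLs returns the non-empty string values of every "thumb_*" key in a
// single Slack file object, keyed by field name. Numeric fields such as
// "thumb_360_w" are skipped because they are not strings.
// Hypothetical helper for illustration only.
func thumbURLs(fileJSON []byte) (map[string]string, error) {
	var fields map[string]interface{}
	if err := json.Unmarshal(fileJSON, &fields); err != nil {
		return nil, err
	}
	urls := make(map[string]string)
	for key, val := range fields {
		if !strings.HasPrefix(key, "thumb_") {
			continue
		}
		if s, ok := val.(string); ok && s != "" {
			urls[key] = s
		}
	}
	return urls, nil
}

func main() {
	// Placeholder data shaped like the fragment above; the URLs are made up.
	sample := []byte(`{
		"id": "F0EXAMPLE",
		"thumb_64": "https://example.invalid/files-tmb/image_64.png",
		"thumb_360": "https://example.invalid/files-tmb/image_360.png",
		"thumb_360_gif": "",
		"thumb_360_w": 341
	}`)
	urls, err := thumbURLs(sample)
	if err != nil {
		panic(err)
	}
	// Sort the keys so the listing is stable between runs.
	keys := make([]string, 0, len(urls))
	for k := range urls {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, urls[k])
	}
}
```

Each collected URL would still need an authenticated download while the workspace is accessible; that part is left out of the sketch.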

rusq commented 2 years ago

Thanks @csaket, I will have a look. Last time I experimented with slack-export-viewer, I noticed that it doesn't allow downloading files, even if the file link points to them.

In scope of this request I think at least the following should be completed 100%:

The other problem is displaying the files in slack-export-viewer, I can see that there could be 2 possible solutions:

I also suggest that, for the scope of this ticket, we switch to the mattermost format as the default one. I quite like it in the sense that all the thumbnails can be downloaded to the same directory as the original file. I.e., if there is a file with ID FABCD123 and name "file.jpg", and "file_thumb_360.jpg" is its thumbnail, they can both be placed in the following directory within the export structure:

.
+- __uploads
|  +- FABCD123
|     +- file.jpg
|     +- file_thumb_360.jpg  
+- general
   +- 2022-01-01.json    ; has a reference to FABCD123/file.jpg and thumbnails

csaket commented 2 years ago

Found an alternative viewer that I have been able to modify and display the downloaded images inline. https://github.com/bsimpson/slack_viewer Have not tried the thumbnails yet.

rusq commented 2 years ago

Thanks for the update, @csaket - did you use the "standard" format or "mattermost" with the slack_viewer?

csaket commented 2 years ago

@rusq - I have been using the same export from slackdump from the start so I guess that is the standard format.

OevreFlataeker commented 2 years ago

+1 for the requested feature! Just playing around with the tool. I see that in "standard" mode the images are tried to be loaded from (using slack-export-viewer.exe) and get 404. When I export in mattermost mode the original URLs to files.slack.com are preserved but not loaded with the viewer due to CORS.

I'd also like to get a FULL offline copy including all the attachments and everything.

Command lines used to create the dumps:

.\slackdump.exe -export myarchive.zip -export-type mattermost -download Cxxxx

or

.\slackdump.exe -export myarchive.zip -export-type standard -download Cxxxx

View via: slack-export-viewer.exe -p 9999 -z .\myarchive.zip

rusq commented 2 years ago

@OevreFlataeker thanks for the feedback, I'll see what I can do.

I just want to mention that the export files are already FULL: if you enable file download, all files are downloaded into the attachments subdirectory (standard) or the __uploads subdirectory (Mattermost). The reason slack-export-viewer does not show them is that it does not expect to find files in the archive, as it was built for the export files generated by Slack.
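As a rough illustration for the standard format, going by the annotated JSON earlier in the thread: the "url_private" value is relative to the channel directory, so a viewer has to join the two to find the file. The helper name `localAttachment` is hypothetical and not part of slackdump or slack-export-viewer.

```go
package main

import (
	"fmt"
	"path"
)

// localAttachment joins the channel directory with the relative
// "url_private" value from the message JSON to get the file's location
// inside a standard-format export. Hypothetical helper for illustration.
func localAttachment(channel, urlPrivate string) string {
	return path.Join(channel, urlPrivate)
}

func main() {
	// Values taken from the annotated fragment earlier in the thread.
	fmt.Println(localAttachment("channelname", "attachments/F02XXXXXXXX-image.png"))
}
```

A viewer that resolves the same value against its server root instead would request /attachments/F02XXXXXXXX-image.png, which is the broken link reported earlier.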

OevreFlataeker commented 2 years ago

Thanks for pointing that out! Those suggestions in https://github.com/rusq/slackdump/issues/111#issuecomment-1229445740 look great! Hope you can get it working! Thanks!

ianbmacdonald commented 1 year ago

Is the safest bet to use standard, in the hope that this milestone may lead to a simple way to view attachments in old exports? Is there a fork of a viewer anywhere that allows the attachments to load? I am just wondering what I should be using today to make this easiest down the road.

dmccarthy4 commented 1 year ago

> Found an alternative viewer that I have been able to modify and display the downloaded images inline. https://github.com/bsimpson/slack_viewer Have not tried the thumbnails yet.

I couldn't get this viewer working. What kind of modification do I need to do?