element-hq / element-web

A glossy Matrix collaboration client for the web.
https://element.io
GNU Affero General Public License v3.0

Feature request: export chat logs #2630

Closed sunnyolives closed 3 years ago

sunnyolives commented 7 years ago

A very useful feature. I find it strange it isn't already implemented, unless I am bad at searching.

Alternatives

What do you think?

ara4n commented 7 years ago

yup, i find myself wanting this repeatedly too. #2129 is close to it, but would be nice to just be able to say "download room as log".

Half-Shot commented 7 years ago

I was once toying with the idea of selecting some of the chat, and hitting some kind of key combo/menu option to copy an IRC-style log into the clipboard?

ara4n commented 7 years ago

This would also be cool!

Nadrieril commented 7 years ago

Any plans to do this ?

ara4n commented 7 years ago

as a p2 feature req this is basically stuck behind everything tagged as p1, which is around 200 issues right now. but given this is FOSS anyone is welcome to contribute it!

clopez commented 7 years ago

I find the search feature of matrix/riot pretty inefficient for an advanced user like me who is used to running greps (even with regexes) over simple text files.

So, being unable to extract simple plain text logs from my matrix conversations (both private and in rooms) is a serious regression for me, when compared with my previous experience with Jabber or IRC clients.

ara4n commented 7 years ago

https://gitlab.com/argit/matrix-recorder is a good workaround for this for now (and even does e2e if you desire, albeit storing it unencrypted). We could also build something like this into riot itself in future.

MaskyS commented 6 years ago

+1 for this feature request

ghost commented 6 years ago

I also need to report that two of my accounts, at Gmail and Yahoo, have been locked and I have had no access to them for a long time. I also worry that my citizen auth key was stored on that account. Maybe someone could help if you have my chats and so powerfullm

scottAnselmo commented 6 years ago

Being able to export IRC-like logs would be an extremely useful feature to have for quickly copying chat logs for recurring FOSS project meetings to archive somewhere. Glad to see it's now labelled p1 instead of p2.

Valodim commented 6 years ago

I miss this feature a lot and I'm tempted to give this a go. However the riot-web code base looks very daunting, especially with the weird separation of sdk and actual app. Can someone who has an overview of the code base give a rough plan of the relevant cutpoints where this would be implemented in riot-web?

t3chguy commented 6 years ago

@Valodim all the code would be in https://github.com/matrix-org/matrix-react-sdk/

ilu33 commented 6 years ago

This feature is also required by GDPR. I think GDPR favours structured formats, json and the like, but I don't mind whether it's txt or anything else. But besides the legal stuff it would be a really really useful feature. Copying the history manually is a p.i.t.a. Also GDPR and other valid reasons convince server operators to delete older history (we are discussing something between 3 and 6 months atm). So whatever is not copied gets lost. So if @Valodim picks this up it would be great.

t3chguy commented 6 years ago

@ilu33 https://matrix.org/docs/projects/other/matrix-recorder.html riot-web doesn't store the data so it doesn't have to be able to export it

ilu33 commented 6 years ago

Nobody requires riot-web to store data. riot-web is just the user frontend and it pulls data from the hs all the time. If the user wants to store selected data the user frontend should provide a way to do so.

Using another tool is a crutch at best. How would the average end user even install matrix-recorder? Not to mention https://gitlab.com/argit/matrix-recorder/issues/1 which makes the tool unusable for everybody who's in several high traffic rooms (which no sane person would want to archive). Also routinely archiving multi-user rooms without any reason or need to do so (just because the tool happens to be catch-all) defies every purpose of data protection. It might be allowed as long as you don't publish the data but it sounds immoral to me. I would not want to do that.

And regarding encryption: Matrix already has problems handling E2EE if more than one device is involved. It regularly breaks for no obvious reason (issues are up here on github). I would not want to recommend using another device to anybody.

t3chguy commented 6 years ago

Yet on the other hand you run into much bigger limitations in webapps in the sizes of files you can generate before the browser kills you

ilu33 commented 6 years ago

hm - limit the amount of lines? Maybe only export the stuff riot-web has already loaded into the browser window?

It seems that matrix-recorder has a similar problem (ORG.MATRIX.JSSDK_TIMEOUT)?

Note @t3chguy : I edited my previous comment.

makedir commented 6 years ago

Why has nothing been done since the ticket was opened on Nov 22, 2016? The Riot client frontend should have a simple, user-friendly, basic chat export for admins: just copy chat from 1.1.2018 to 1.2.2018 in channel x to the clipboard. It is also a stubborn and ridiculous excuse to say the client doesn't need to have this feature because of size limitations. Seriously? We are speaking of chat logs, not of file attachments. If you copy a chat log for a day or a week it's like 10-1000kb in size.

Nadrieril commented 6 years ago

@makedir It does indeed seem like a very useful feature, but please keep it civil. Developers have their own priorities. If you believe it is not very hard, why not give it a go ? Contributing to a super cool project like Riot is a fun experience !

psaavedra commented 6 years ago

JFYI, @thiblahute has developed a simple but useful command-line tool for this: https://gitlab.gnome.org/thiblahute/matrix-dl

To install matrix-dl without messing up our system's Python setup, we will document how to install it using a virtual environment. The virtualenv command is provided by a Python package that enables the creation of these virtual environments. You can install virtualenv using your packaging system.

Run:

virtualenv -p python3 matrix
cd matrix
source bin/activate

Now, clone the code:

git clone https://gitlab.gnome.org/thiblahute/matrix-dl.git

And install the dependencies and the script itself in the virtual environment:

cd matrix-dl
python setup.py install

Usage:

The tool's usage instructions are these:

matrix-dl [-h] [--password PASSWORD] [--matrix-url MATRIX_URL]
          [--start-date START_DATE]
          username room

Download backlogs from Matrix as raw test

positional arguments:
  username
  room

optional arguments:
  -h, --help               show this help message and exit
  --password PASSWORD      Will be asked later if not provided
  --matrix-url MATRIX_URL
  --start-date START_DATE  format %d%m%Y
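
The `%d%m%Y` format above means day, month, and four-digit year with no separators, so `01012018` is 1 January 2018. A minimal Python sketch of how such a value can be parsed; the helper name `parse_start_date` is hypothetical and is not taken from matrix-dl:

```python
from datetime import datetime


def parse_start_date(value: str) -> datetime:
    """Parse a date written as day, month, year with no separators (%d%m%Y)."""
    return datetime.strptime(value, "%d%m%Y")


start = parse_start_date("01012018")
print(start.date().isoformat())  # ISO form of the parsed date
```

A value that does not match the format, such as `2018-01-01`, raises `ValueError`, which is a quick way to check a date before passing it to the tool.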

A couple examples:

Let's download the conversations from Example channel since the beginning of 2018:

matrix-dl --matrix-url https://matrix.example.com/ --start-date 01012018 \
  <fsurname> "Example" > example-2018.log

Then you will be asked for your password, and if there are no errors, the conversations will be dumped in the file example-2018.log with the format hh:mm:ss — @user: message
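
For readers who want to produce the same kind of line from raw Matrix events, here is a rough Python sketch. It assumes events shaped like Matrix `m.room.message` events (`origin_server_ts` in milliseconds, `sender`, `content.body`). The function names are hypothetical, and rendering times in UTC is an assumption; matrix-dl itself may behave differently.

```python
from datetime import datetime, timezone


def format_event(event: dict, separator: str = " \u2014 ") -> str:
    """Render one message event as 'hh:mm:ss <separator> @user: message'."""
    # origin_server_ts is milliseconds since the Unix epoch.
    ts = datetime.fromtimestamp(event["origin_server_ts"] / 1000, tz=timezone.utc)
    return f"{ts:%H:%M:%S}{separator}{event['sender']}: {event['content']['body']}"


def format_log(events: list) -> str:
    """Format only text-bearing message events, one per line."""
    lines = [
        format_event(e)
        for e in events
        if e.get("type") == "m.room.message" and "body" in e.get("content", {})
    ]
    return "\n".join(lines)


sample = [
    {"type": "m.room.message", "sender": "@alice:example.com",
     "origin_server_ts": 1514768461000, "content": {"body": "hello"}},
    {"type": "m.room.member", "sender": "@bob:example.com",
     "origin_server_ts": 1514768462000, "content": {"membership": "join"}},
]
print(format_log(sample))
```

Non-message events (joins, topic changes) are skipped here; a fuller export would need a rule for each event type.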

You can also dump conversations from unnamed rooms, such as personal conversations; you just need the room's internal ID. You can get this string in Riot by clicking the room's settings icon (currently the gear); at the end of the settings, in the advanced section, is the room's ID:

matrix-dl --matrix-url https://matrix.example.com/ --start-date 01012018 \
  <fsurname> \!i4BiDaYPkvfbcWdAgb:example.com > my-chat-2018.log

Remember to escape the ! symbol; otherwise the shell may interpret it specially (in bash, an unquoted ! triggers history expansion).
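
If the command is assembled from a script rather than typed by hand, the quoting can be left to Python's standard `shlex` module. This is only a sketch: the helper `build_matrix_dl_command` is hypothetical, and the flags are the ones shown in the usage text above.

```python
import shlex


def build_matrix_dl_command(matrix_url: str, start_date: str,
                            username: str, room: str) -> list:
    """Return the argument list for a matrix-dl call."""
    return ["matrix-dl", "--matrix-url", matrix_url,
            "--start-date", start_date, username, room]


args = build_matrix_dl_command(
    "https://matrix.example.com/", "01012018",
    "fsurname", "!i4BiDaYPkvfbcWdAgb:example.com",
)
# shlex.join single-quotes any argument containing shell-special characters.
print(shlex.join(args))
```

Passing `args` as a list to `subprocess.run` avoids the shell entirely, so no escaping is needed in that case.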

Eldovar commented 6 years ago

What Thibault @thiblahute has programmed is for sure useful, but only for programmers who know how to install and deal with that software.

An export of the chatlog to simple text is essential for example for group meetings, to be able to write the minutes after the meeting. Therefore: Please, please, please implement a possibility to export the chatlog to a simple text file. Thank you very much.

ilu33 commented 6 years ago

Any progress on this? "manually running something like select * from events where user=?" as suggested here https://matrix.org/blog/2018/05/08/gdpr-compliance-in-matrix/ is not really a solution. I can't even find this issue in the GDPR project timeline.

lapineige commented 5 years ago

> JFYI, @thiblahute was developed a simple but useful command line for this: https://gitlab.gnome.org/thiblahute/matrix-dl

@psaavedra does it work on encrypted channels too?

Thanks for this tool BTW.

> Please, please, please implement a possibility to export the chatlog to a simple text file.

Allowing every user to do this in a convenient way sounds like an important option to me, at least for backup before deleting the history, or exporting to another channel, or searching in an encrypted channel (as that is not working right now), and so on.

Grg-M commented 5 years ago

@thiblahute, @psaavedra when I try to run the command-line tool to dump the log for unencrypted named rooms, I receive this error:

raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

For encrypted rooms, I instead receive this error: event not found

Any suggestions on how to debug these?

Being able to export the logs would be extremely useful.
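
As a general debugging aid for the first error: Python's `json` module raises exactly this message when the text it is asked to parse is empty or is not JSON at all (an HTML error page, for instance). Whether that is what happened here is not known from the thread. The sketch below, with a hypothetical helper `parse_json_body`, shows how the start of the body can be surfaced to see what the server actually returned.

```python
import json


def parse_json_body(body: str):
    """Parse a response body as JSON, reporting the start of the body on failure."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        preview = body[:80]
        raise ValueError(
            f"response was not JSON ({err}); body starts with: {preview!r}"
        ) from err


try:
    parse_json_body("<html>Not Found</html>")
except ValueError as exc:
    print(exc)
```

Seeing an HTML page or an empty string in the preview usually points at a wrong URL or a failed request rather than at the JSON parser.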

Nigamanth96 commented 5 years ago

@psaavedra , I do not understand what the matrix URL is. Can you please clarify what matrix.example.com is? I mean, is the 'example' in the URL the channel name? I get a lot of exceptions while doing the same but replacing the 'example' with the channel I've created. BTW there is nothing called 'channel', but only rooms right?

Nigamanth96 commented 5 years ago

[image] I get an exception like this.

Nigamanth96 commented 5 years ago

I've got no idea what URL is the right one

JimmyCushnie commented 5 years ago

Really sorry if this is the wrong place to ask for help, but it is absolutely critical that I download a Matrix chat room and I don't know where else to go. I'm trying to use @thiblahute's tool, like so:

python matrix-dl --matrix-url https://riot.im/ [my username] "[room name]"

then I enter my password. Then I get this error:

[my username] connecting to https://riot.im/
Traceback (most recent call last):
  File "matrix-dl", line 164, in <module>
    getter.run()
  File "matrix-dl", line 80, in run
    password=self.password)
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\client.py", line 249, in login_with_password
    return self.login(username, password, limit, sync=True)
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\client.py", line 270, in login
    "m.login.password", user=username, password=password, device_id=device_id
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\api.py", line 160, in login
    return self._send("POST", "/login", content)
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\api.py", line 691, in _send
    code=response.status_code, content=response.text
matrix_client.errors.MatrixRequestError: 404: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /_matrix/client/r0/login was not found on this server.</p>
</body></html>

What am I doing wrong?

aaronraimist commented 5 years ago

In the future you should open an issue on that repo https://gitlab.gnome.org/thiblahute/matrix-dl, not here. I assume the problem is Riot is not a matrix server. The server is the bit at the end of your username. So if my account was @aaron:matrix.org, https://matrix.org would be the server or as they call it matrix-url.

No need to reply thanks if that fixed it. If it didn’t, go open an issue on https://gitlab.gnome.org/thiblahute/matrix-dl.
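
The rule of thumb above (the server is the part after the colon in the user ID) can be sketched in Python as follows. The function names are hypothetical. Prefixing `https://` to the server name is only a first guess, because some homeservers publish a different base URL through `.well-known` delegation.

```python
def server_name_from_user_id(user_id: str) -> str:
    """Return the server part of a Matrix user ID such as '@aaron:matrix.org'."""
    if not user_id.startswith("@") or ":" not in user_id:
        raise ValueError(f"not a full Matrix user ID: {user_id!r}")
    # Split on the first colon only: the server part may itself contain a port.
    return user_id.split(":", 1)[1]


def guess_matrix_url(user_id: str) -> str:
    """First guess at the value to pass as --matrix-url (an assumption, see above)."""
    return f"https://{server_name_from_user_id(user_id)}"


print(guess_matrix_url("@aaron:matrix.org"))
```

A web client address such as https://riot.im/ is not a homeserver, which is consistent with the 404 on `/_matrix/client/r0/login` shown earlier in the thread.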

grahamperrin commented 5 years ago

> … backup before deleting the history, …

Or (edge case) before rejoining a room from which you have been kicked through no fault of your own.

An example of history lost after rejoining: https://github.com/matrix-org/synapse/issues/2212#issuecomment-487407191

End-to-End-is-the-way commented 5 years ago

I also would love to see this feature...

matheusfillipe commented 5 years ago

Is there any way to do that? I tried what's on https://gitlab.gnome.org/thiblahute/matrix-dl but it doesn't work for me. I opened an issue there but it seems like development there is inactive.

makedir commented 5 years ago

The method works fine if you do it correctly; it is really annoying though. And the matrix-dl client doesn't export all ASCII characters correctly. I asked the dev about it and he never wrote back.

ara4n commented 5 years ago

i’ve been thinking about this while implementing a smarter clipboard for riot-web. I realised i don’t actually necessarily understand what the 63 people upvoting are after here. is it:

  1. 👍 - ability to export logs from Riot/Web for a given room(s) as a big lump of static HTML, suitable for printing or grepping or sharing out of band? This is hard, given public rooms can easily have millions of messages, and you would probably run out of RAM (or bandwidth or time) trying to export them. But we could implement it with a date range or msgcount limit.
  2. ❤️ - ability for Riot/Desktop to save all the messages it sees on disk as HTML or plain text, a bit like an IRC client would, spidering to fill in any gaps in the logs?
  3. 🚀 - actually, pantalaimon / matrix-recorder / matrix-dl etc actually have solved this already for my use case.

Seshat may help with option 2 in the near future. Meanwhile, i may be able to tweak the clipboard code to easily support option 1.

Please vote by upvoting this msg with the matching emoji so I can get more of a handle.

matheusfillipe commented 5 years ago

What would be perfect to me is option 2 :heart:

I wasn't able to check those you mentioned in option 3 but I will try them out now. Still, Riot could maybe have such an option.

makedir commented 5 years ago

Isn't it obvious what people want? A simple per-chat-room export button with a start date and an end date: export all chat messages from start date to end date and provide them as an html, txt, or zip file.

JimmyCushnie commented 5 years ago

I want to export personal chats with only thousands to tens of thousands of messages, so running out of ram isn't a concern for me. But even if it was, "you have to use disk as ram to export chats" is better than "you can't export chats at all"

ilu33 commented 5 years ago

No 1 :+1: , per channel, with start and end date and probably a reasonable msgcount limit. If the room is big or old, people would have to maneuver around this limit by selecting periodic chunks.

Maybe give format options, xml or json would work too. And thanks for coming back to this.
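
The date-range-plus-limit idea can be sketched in Python like this. It assumes events are already in memory as dicts with an `origin_server_ts` field in milliseconds, as in the Matrix event format. The function names and the sample data are hypothetical, and this is not how Riot implements or plans to implement export.

```python
import json
from datetime import datetime, timezone


def select_events(events: list, start: datetime, end: datetime, limit: int) -> list:
    """Keep events with start <= timestamp < end, oldest first, capped at `limit`."""
    start_ms = int(start.timestamp() * 1000)
    end_ms = int(end.timestamp() * 1000)
    in_range = [e for e in events if start_ms <= e["origin_server_ts"] < end_ms]
    in_range.sort(key=lambda e: e["origin_server_ts"])
    return in_range[:limit]


def export_json(events: list) -> str:
    """Serialise the selected events as a structured (JSON) export."""
    return json.dumps(events, indent=2, ensure_ascii=False)


events = [
    {"sender": "@a:example.com", "origin_server_ts": 1517443200000, "content": {"body": "feb"}},
    {"sender": "@b:example.com", "origin_server_ts": 1516000000000, "content": {"body": "mid jan"}},
    {"sender": "@a:example.com", "origin_server_ts": 1514764800000, "content": {"body": "new year"}},
]
january = select_events(
    events,
    datetime(2018, 1, 1, tzinfo=timezone.utc),
    datetime(2018, 2, 1, tzinfo=timezone.utc),
    limit=1000,
)
print(export_json(january))
```

Exporting a large room in periodic chunks, as suggested above, then amounts to calling `select_events` repeatedly with consecutive date ranges.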

rubo77 commented 5 years ago

No 1 👍, but it should be possible to select multiple rooms at once

melyux commented 4 years ago

Why not both? Export on demand, and also save all logs/events to disk as plaintext logs for posterity.

1ykos commented 4 years ago

I just noticed I spammed Matrix with lots of pictures (1.5MB each) and would like to help you clean up, but scrolling up takes around 10 minutes (I used a heavy item on the pgup). It would be so great if this process was easier, that is exporting and deleting chatrooms.

arthurlutz commented 4 years ago

Does the integration of search through seshat which claims to support E2EE rooms https://github.com/vector-im/riot-web/pull/11125 change this feature request ? Could we use the indexed data collected there ?

martindale commented 4 years ago

Allow for import as well! 👍

arthurlutz commented 4 years ago

@martindale indeed. this would help unload the high traffic on matrix.org by enabling a migration to another (self-hosted?) server.

aaronraimist commented 4 years ago

@arthurlutz I assume you are aware but you can already migrate to another server (that’s kind of the whole point of Matrix, that no one server owns a room). Just join the room from your new server. If you want the full history you can just scroll up in the room and then your new server will have a full copy of the room. (Yes there should probably be a button in Riot that performs both steps, joins a room and requests the full history. I just filed https://github.com/vector-im/riot-web/issues/12766 for that.)

KopfKrieg commented 4 years ago

> If you want the full history you can just scroll up in the room and then your new server will have a full copy of the room.

That depends on your room settings and is not always possible. But thanks for filing the issue, hope it can be resolved. AFAIK the plan is to use a UUID derived from a private key as identifier (instead of username@server), and the server just being something you can change as you wish.

protist commented 4 years ago

Has anyone actually managed to export E2E encrypted chats successfully? With COVID-19, there are many of us working from home now, and presumably many looking for new chat platforms. Without exported chat logs, Matrix/Riot are not viable options for me.

I've checked out the existing export options. matrix-dl doesn't appear to export encrypted rooms. I can't even get matrix-recorder to build, and it looks like it may have been abandoned anyway. pantalaimon was mentioned above, but it's unclear to me how to export chat logs with this.

JimmyCushnie commented 4 years ago

I tried a number of approaches several months ago but I ended up giving up. If it's presently possible to export E2E chats, it is really really difficult to do so and it needs to be more accessible.

protist commented 4 years ago

Thanks @JimmyCushnie. I might have to go to a closed-source platform for now then.

lapineige commented 4 years ago

Will the future "daemon" (I don't know if that's the correct name) that will allow searching an encrypted channel be able to export chat logs? I thought I'd seen that information somewhere, yet I can't find it.