zulip / python-zulip-api

Python library for the Zulip API.
https://zulip.com/api/
Apache License 2.0
355 stars 358 forks source link

[WIP] Enhanced matrix bridge. #723

Closed ro-i closed 2 years ago

ro-i commented 3 years ago

Hi :)

As discussed here, I'm now creating the pull request for the new Matrix <-> Zulip bridge. See here for further discussions about uploading an in-memory file-object to a Zulip server.

Feedback would be greatly appreciated! :)

rht commented 3 years ago

Also reminder to update https://zulip.com/integrations/doc/matrix once this PR is merged.

rht commented 3 years ago

There are some Mypy errors in the CI.

ro-i commented 3 years ago

Yeah, I'm about to fix them. Weirdly, I cannot reproduce them with tools/lint or tools/run-mypy...

ro-i commented 3 years ago

There are some Mypy errors in the CI.

static analysis is fixed now - I didn't realize that we are stuck to python 3.6 :eyes:

timabbott commented 2 years ago

It'd be great to rebase this and get CI passing on it.

zulipbot commented 2 years ago

Heads up @ro-i, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/main branch and resolve your pull request's merge conflicts accordingly.

ro-i commented 2 years ago

Yeah, I try to look into this again this week ^^ I remember having one or two issues we might need to tackle, but let's see :)

ro-i commented 2 years ago

[Sorry for the delay :eyes: Conflicts are resolved, now I need to cleanup a bit :) ]

ro-i commented 2 years ago

Now, I need to figure out, why the bridge needs a bit too much CPU when it actually should be idling... :thinking:

timabbott commented 2 years ago

@ro-i do you have time to update this? It's be nice to try to get this integrated.

rht commented 2 years ago

I can take over updating the PR if needed.

zulipbot commented 2 years ago

Heads up @ro-i, we just merged some commits that conflict with the changes you made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/main branch and resolve your pull request's merge conflicts accordingly.

ro-i commented 2 years ago

Hi :) I'm sorry that I forgot to update this. :see_no_evil: This semester, I have been rather busy (after august 15, which is the deadline of my bachelor's thesis, this will improve :) ). However, I would also love to get this finally merged! :tada:

Have some of you tried the new bridge for yourselves? Because the problem I was still struggling with has been the CPU load. When the bridge should be idling (at least in my opinion), it actually has a CPU load of 10% to 13% (observed using top with an interval of 1s). Did you experience this, too? I once asked about this on the nio matrix channel (on October 1, 2021 actually ^^). At this time, the CPU load was even higher (~30%). But I didn't manage to resolve this. I'm not sure why the CPU load has improved. However, I still think that it is too much. What do you think?

rht commented 2 years ago

I once asked about this on the nio matrix channel (on October 1, 2021 actually ^^)

What did they say about it? If it is nio's perf problem, there is nothing we can do about it, and shouldn't be a blocker to this PR getting merged.

ro-i commented 2 years ago

They told me to use a profiler, which I did: Screenshot from 2022-07-28 09-57-35

Then they said nothing more :thinking:

ro-i commented 2 years ago

However, my code changed since then. So maybe I should give it another try?

ro-i commented 2 years ago

Just to be sure that I did not mess up with asyncio :thinking:

rht commented 2 years ago

More effective to use this lib instead: https://pypi.org/project/line-profiler/ gives line-by-line profile, and has been used in investigating Zulip code, e.g. the Markdown parsing.

ro-i commented 2 years ago

Hm, that sounds cool, thanks! However, I seem to not be able to get it to work:

$ kernprof -lv matrix_bridge.py -c ~/matrix_bridge.conf 
Starting Zulip <-> Matrix mirroring bot
Starting message handler on Matrix client
Starting message handler on Zulip client
^C

kernprof does not generate any output. Do I overlook something?

rht commented 2 years ago

You have to decorate the function you want to profile with @profile, and then run this

kp () {
    kernprof -v -l $1 > profile.txt
}
kp your_python_file.py
ro-i commented 2 years ago

I did, but it didn't work. Probably because I need to terminate the script externally or per exception... Or the profiler cannot deal with async code?

ro-i commented 2 years ago

Hm, and somehow, the requirements of the matrix bridge from requirements.txt do not get installed during the setup phase :thinking: (That is why the tests fail...)

rht commented 2 years ago

Async should work https://github.com/rkern/line_profiler/issues/103.

rht commented 2 years ago

I think it's fine to add matrix-nio in https://github.com/zulip/python-zulip-api/blob/main/requirements.txt.

rht commented 2 years ago

That requirements.txt is for dev-purpose only.

ro-i commented 2 years ago

Async should work rkern/line_profiler#103.

I got it working. I just needed to make sure that I can trigger the termination without exit (so that the main function just returns). I did not yet see any obvious issues, though...

Total time: 0.001073 s
File: ./matrix_bridge.py
Function: run at line 208

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   208                                               @profile
   209                                               async def run(self) -> None:
   210         1         16.0     16.0      1.5          print("Starting message handler on Matrix client")
   211                                           
   212                                                   # Set up event callback.
   213         1          8.0      8.0      0.7          self.matrix_client.add_event_callback(self._matrix_to_zulip, nio.Event)
   214                                           
   215         1       1049.0   1049.0     97.8          await self.matrix_client.sync_forever()

Total time: 0.001328 s
File: ./matrix_bridge.py
Function: run at line 394

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   394                                               @profile
   395                                               async def run(self) -> None:
   396         1          9.0      9.0      0.7          print("Starting message handler on Zulip client")
   397                                           
   398         1          2.0      2.0      0.2          self.loop = asyncio.get_event_loop()
   399                                           
   400         2        177.0     88.5     13.3          with ThreadPoolExecutor() as executor:
   401         2       1138.0    569.0     85.7              await asyncio.get_event_loop().run_in_executor(
   402         1          2.0      2.0      0.2                  executor, self.zulip_client.call_on_each_message, self._zulip_to_matrix
   403                                                       )
ro-i commented 2 years ago
Total time: 0.124969 s
File: ./matrix_bridge.py
Function: _matrix_to_zulip at line 95

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    95                                               @profile
    96                                               async def _matrix_to_zulip(self, room: nio.MatrixRoom, event: nio.Event) -> None:
    97         2         66.0     33.0      0.1          logging.debug("_matrix_to_zulip; room %s, event: %s" % (str(room.room_id), str(event)))
    98                                           
    99                                                   # We do this to identify the messages generated from Zulip -> Matrix
   100                                                   # and we make sure we don't forward it again to the Zulip stream.
   101         2          9.0      4.5      0.0          if event.sender == self.matrix_config["mxid"]:
   102         1          1.0      1.0      0.0              return
   103                                           
   104         1          1.0      1.0      0.0          if room.room_id not in self.matrix_config["bridges"]:
   105                                                       return
   106         1          1.0      1.0      0.0          stream, topic = self.matrix_config["bridges"][room.room_id]
   107                                           
   108         1         19.0     19.0      0.0          content: Optional[str] = await self.get_message_content_from_event(event, room)
   109         1          1.0      1.0      0.0          if not content:
   110                                                       return
   111                                           
   112         1          1.0      1.0      0.0          try:
   113         2     123569.0  61784.5     98.9              result: Dict[str, Any] = self.zulip_client.send_message(
   114         1          1.0      1.0      0.0                  {
   115         1          1.0      1.0      0.0                      "type": "stream",
   116         1          0.0      0.0      0.0                      "to": stream,
   117         1          0.0      0.0      0.0                      "subject": topic,
   118         1          1.0      1.0      0.0                      "content": content,
   119                                                           }
   120                                                       )
   121                                                   except Exception as exception:
   122                                                       # Generally raised when user is forbidden
   123                                                       raise Bridge_FatalZulipException(exception)
   124         1          2.0      2.0      0.0          if result["result"] != "success":
   125                                                       # Generally raised when API key is invalid
   126                                                       raise Bridge_FatalZulipException(result["msg"])
   127                                           
   128                                                   # Update the bot's read marker in order to show the other users which
   129                                                   # messages are already processed by the bot.
   130         2       1294.0    647.0      1.0          await self.matrix_client.room_read_markers(
   131         1          2.0      2.0      0.0              room.room_id, fully_read_event=event.event_id, read_event=event.event_id
   132                                                   )

Total time: 0.124933 s
File: ./matrix_bridge.py
Function: _zulip_to_matrix at line 258

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   258                                               @profile
   259                                               def _zulip_to_matrix(self, msg: Dict[str, Any]) -> None:
   260         2         74.0     37.0      0.1          logging.debug("_zulip_to_matrix; msg: %s" % (str(msg),))
   261                                           
   262         2          4.0      2.0      0.0          if msg["content"] == "exit":
   263         1          3.0      3.0      0.0              raise Bridge_FatalZulipException()
   264                                           
   265         1          4.0      4.0      0.0          room_id: Optional[str] = self.get_matrix_room_for_zulip_message(msg)
   266         1          1.0      1.0      0.0          if room_id is None:
   267                                                       return
   268                                           
   269         1          1.0      1.0      0.0          sender: str = msg["sender_full_name"]
   270         2          5.0      2.5      0.0          content: str = MATRIX_MESSAGE_TEMPLATE.format(
   271         1          1.0      1.0      0.0              username=sender, uid=msg["sender_id"], message=msg["content"]
   272                                                   )
   273                                           
   274                                                   # Forward Zulip message to Matrix.
   275         2     123735.0  61867.5     99.0          self._matrix_send(
   276         1          0.0      0.0      0.0              room_id=room_id,
   277         1          0.0      0.0      0.0              message_type="m.room.message",
   278         1          1.0      1.0      0.0              content={"msgtype": "m.text", "body": content},
   279                                                   )
   280                                           
   281                                                   # Get embedded files.
   282         3        167.0     55.7      0.1          files_to_send, media_success = asyncio.run_coroutine_threadsafe(
   283         1         28.0     28.0      0.0              self.handle_media(msg["content"]), self.loop
   284         1        895.0    895.0      0.7          ).result()
   285                                           
   286         1          7.0      7.0      0.0          if files_to_send:
   287                                                       self._matrix_send(
   288                                                           room_id=room_id,
   289                                                           message_type="m.room.message",
   290                                                           content={"msgtype": "m.text", "body": "This message contains the following files:"},
   291                                                       )
   292                                                       for file in files_to_send:
   293                                                           self._matrix_send(room_id=room_id, message_type="m.room.message", content=file)
   294         1          7.0      7.0      0.0          if not media_success:
   295                                                       self._matrix_send(
   296                                                           room_id=room_id,
   297                                                           message_type="m.room.message",
   298                                                           content={
   299                                                               "msgtype": "m.text",
   300                                                               "body": "This message contained some files which could not be forwarded.",
   301                                                           },
   302                                                       )
rht commented 2 years ago

I suppose zulip_client.send and _matrix_send are slow because of the network request. And it's unlikely that they contribute to the CPU load. Protip: you can also add @profile in the installed site-packages files of matrix-nio itself if you want to get the line profiling.

Anyway, I think we should discuss about the performance elsewhere, so that we can focus on getting this PR merged.

Everything else other than this comment (the message might be less concise with the extra ID, but I have no strong opinion on this) LGTM.

ro-i commented 2 years ago

Ok :) I think it would be good to have something unique to identify the users from both sides. The matrix id, which is shown on the Zulip side, is of course a bit nicer than the Zulip user id shown on the Matrix side. But the internal Zulip email address wouldn't make much sense either, so I think it's not bad :)

timabbott commented 2 years ago

This looks great, merged, thanks @ro-i and @rht!

I think we can wait for feedback from users on the precise formatting; it seems possible it's overly verbose in a way that's annoying, but I'm not really sure. Perhaps you can post some screenshots in #integrations to get more eyes on it.

(The CI is an infrastructure failure on a Windows Actions node).

The one thing that seems like a natural follow-up to me is that it'd probably be better if the Matrix bridge were a separate zulip-bridge-matrix package. It'd still be in this codebase, and built as part of the same source bundle, just be another output, like the zulip_botsserver package.

andersk commented 2 years ago

(The CI is an infrastructure failure on a Windows Actions node).

Unfortunately, this was not just an infrastructure failure. I’ve had to revert the last commit in #763 to restore passing CI.

@ro-i When you get a chance, please open a new pull request restoring this commit. The failure on Windows will need to be debugged before it can be merged.

ro-i commented 2 years ago

Will do as discussed here!