Open hab25 opened 6 months ago
That, I'm afraid, would never be possible with the simple approach used in the package. Atomic-chrome works that way by keeping the communication channel between the browser and the editor. exwm-edit is made to work not only with the browser, but also with any text input in any app.
I don't know of an easy way to track the exact text element in any app and sync the text in it with a different source. I'm afraid this feature may not be easily possible.
> track the exact text element in any app
Note that bidirectionality (like atomic-chrome) is not a requirement here. Would such tracking be necessary if we only want to continuously update from Emacs to the app, and not vice versa?
Bidirectionality is not even the issue yet. Before thinking about that, consider that the automation would require tracking specific input elements.
Even when constrained to X11 use, with Xlib or xcb (low-level libraries for interacting with X11), you could potentially find text input fields by looking for windows with certain properties or attributes, but this would likely involve a good deal of trial-and-error and might not work reliably across different applications.
That's why atomic-chrome works nicely with a browser: it is meant to work only with a browser. The closest one can get is probably to seek some compatibility with the idea implemented in atomic-chrome: if the original app is the browser, use one approach; if not, fall back to the simpler approach of copy-n-pasting. However, I bet that it still would not be very reliable, would make the package more complex, and would require supporting a browser extension.
Unfortunately, X11 doesn't have a standardized way of interacting with text within another application's text fields or tracking and identifying those fields.
To make matters more challenging, an increasing number of Linux distributions are moving away from X11 to Wayland, which has even stricter security restrictions against applications interacting with each other's windows. That isn't a point of concern for this particular project, I'm just trying to illustrate why this idea of yours would be pretty hard to pull off.
Sure, one can get very inventive and ask: "Why can't we use something similar to xdotool to update the text asynchronously?" The problem becomes obvious once you realize that xdotool (and similar tools) can't track specific input elements, due to the limitations and security concerns I described above. With xdotool, it's difficult even to find the exact window when there are several. Imagine trying to type into the "post comment" text element, but your text goes into the address bar of your browser, and not even in the window you wanted.
It made me wonder whether it is possible to track the coordinates of where the caret was before invoking emacsclient. Unfortunately, this isn't easily feasible either.
The main issue is that the caret is a software construct that's internal to each application, and X11 doesn't necessarily know anything about it. The text caret is typically drawn by the application (or the GUI toolkit the application uses), and its position doesn't necessarily correspond to anything that X11 knows about.
In most cases, the application doesn't tell X11 where the caret is; it just tells X11 what to draw (i.e., it sends drawing commands to draw the text characters, the caret, etc.), and X11 doesn't keep track of what each drawing command represents. In addition, the drawing commands that the application sends to X11 are often at a lower level of abstraction than individual widgets or carets (e.g., drawing commands could be at the level of lines, rectangles, or bitmaps), and the details can vary widely between different applications and GUI toolkits.
In theory, a carefully crafted xcb or xlib program might be able to infer the caret position by watching for specific X11 events or properties in specific applications, but this would likely require deep knowledge of the inner workings of those applications and their GUI toolkits, and it may not be feasible or reliable in a general case, across different applications.
In theory, you could record the position of the mouse when the user is typing (i.e., force the mouse into the position where the caret is), and then use xdotool or an xcb/Xlib/EXWM call to move the mouse back to those coordinates and simulate a click. This may set the focus to the text field under the mouse (assuming the focus follows the mouse).
However, there are a few potential issues you may face:
User movement: If the user moves the application's window, scrolls the content, or changes the size of the window, and you later move the mouse back to the recorded coordinates and click, it might not click on the intended text field, or possibly not even on that application's window.
GUI Redraws: The position of GUI elements might change due to a variety of factors, such as screen resolution, window size, or other GUI events like a pop-up or a dialog box appearing.
System Compatibility: Different desktop environments or window managers handle mouse and focus behavior differently, so there would be variations on different systems.
User Disruption: Depending on the design of your program, the act of moving the mouse could disrupt what the user is currently doing.
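To make the mouse-position idea concrete, here is a minimal Emacs Lisp sketch of it. It assumes the `xdotool` binary is installed and that the session runs under X11; the `sketch-` function names are hypothetical and not part of exwm-edit. It only records and replays pointer coordinates, so every issue listed above still applies.

```elisp
(defun sketch-parse-mouse-location (output)
  "Parse OUTPUT of `xdotool getmouselocation --shell' into an alist.
Each line has the form KEY=VALUE, e.g. X=880.  Keys become
lowercase symbols and values become numbers."
  (let (result)
    (dolist (line (split-string output "\n" t))
      (when (string-match "\\`\\([A-Z]+\\)=\\(-?[0-9]+\\)\\'" line)
        (push (cons (intern (downcase (match-string 1 line)))
                    (string-to-number (match-string 2 line)))
              result)))
    (nreverse result)))

(defun sketch-save-mouse-location ()
  "Return the current pointer location as an alist, using xdotool."
  (with-temp-buffer
    (call-process "xdotool" nil t nil "getmouselocation" "--shell")
    (sketch-parse-mouse-location (buffer-string))))

(defun sketch-click-at (location)
  "Move the pointer to LOCATION and simulate a left click.
LOCATION is an alist with `x' and `y' entries, as returned by
`sketch-save-mouse-location'."
  (call-process "xdotool" nil nil nil
                "mousemove"
                (number-to-string (alist-get 'x location))
                (number-to-string (alist-get 'y location))
                "click" "1"))
```

One would call `sketch-save-mouse-location` before opening the edit buffer and `sketch-click-at` with the saved value before pasting. Nothing here verifies that the same text field is still under those coordinates.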
Hopefully you can see how this "simple" task of synchronizing text between Emacs and an input field can quickly get out of hand. But I'd like to remain skeptical of my own skepticism and try to find a way to achieve that. Even if we never find a solution, I'm sure some good ideas would emerge. Let's keep brainstorming on this.
Interesting ideas!
> In most cases, the application doesn't tell X11 where the caret is; it just tells X11 what to draw (i.e., it sends drawing commands to draw the text characters, the caret, etc), and X11 doesn't keep track of what each drawing command represents.
Very subjective point: IMO the focus of all new code should be Wayland; X11 is almost entirely a waste of time. Given X11's deprecation, I expect the overwhelming majority of users will transition away from it in the coming years, making the expected value of such code very low.
> However, there are a few potential issues you may face:
I agree, and to me these issues make the usability of such a solution very low.
I am very unfamiliar with both the exwm and exwm-edit codebases, but when opening this issue I expected that it could be implemented by simply modifying the current mechanism to "run in a loop": instead of the user calling exwm-edit--finish interactively, put it in the buffer-local value of 'after-change-functions along with a subsequent call to #'exwm-edit--compose, which restarts the loop. This probably requires slight modification of existing functions to adjust the current buffer and avoid killing the exwm-edit buffer.
If the idea isn't clear, hopefully the following helps. Treat it as untested pseudocode; I'm not sufficiently proficient with the exwm-edit codebase:
(add-hook
 'exwm-edit-mode-hook
 (lambda ()
   (add-hook
    'after-change-functions
    (lambda (&rest _)
      ;; `exwm-edit--finish` probably needs to be modified to not kill
      ;; the exwm-edit buffer (or alternatively, create a new
      ;; `#'exwm-edit--send` that is identical except that it doesn't
      ;; call `#'kill-buffer`)
      (exwm-edit--finish)
      ;; not doing `(exwm-edit--compose)` directly to avoid recursion,
      ;; which could be confusing to `#'quit` out of and could grow the
      ;; call stack too large
      (run-with-timer 0 nil
                      ;; note: this must be run with the buffer
                      ;; containing app being edited current. If
                      ;; necessary, simply and appropriately wrap with
                      ;; `with-current-buffer`. It may be necessary to
                      ;; create a new `exwm-edit--last-buffer` variable
                      ;; as calling `(set-window-configuration
                      ;; exwm-edit--last-window-configuration)` is
                      ;; probably too visually distracting to be called
                      ;; through `after-change-functions`
                      #'exwm-edit--compose))
    nil
    t)))
I think the only caveat with this solution (I could be wildly wrong about this) is that it does not support the "if you already have something pre-selected" case; with most apps, content pasted by C-v won't be selected, and so exwm-edit--compose will do a C-a in the app.
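One possible refinement of the loop above, offered only as a sketch: running the send step on every single change could be noisy, so the hook could instead schedule a debounced send once Emacs has been idle for a moment. Everything named `sketch-live-sync-` below is hypothetical and not part of exwm-edit; the function passed to `sketch-live-sync-enable` stands in for a non-killing variant of exwm-edit--finish, which would still have to be written.

```elisp
(defvar sketch-live-sync-delay 0.3
  "Idle seconds to wait after the last change before sending.")

(defvar-local sketch-live-sync-function #'ignore
  "Function called with the buffer text as its only argument.")

(defvar-local sketch-live-sync--timer nil
  "Pending idle timer for the current buffer, or nil.")

(defun sketch-live-sync--flush (buffer)
  "Send the full text of BUFFER through `sketch-live-sync-function'."
  (when (buffer-live-p buffer)
    (with-current-buffer buffer
      (setq sketch-live-sync--timer nil)
      (funcall sketch-live-sync-function
               (buffer-substring-no-properties (point-min) (point-max))))))

(defun sketch-live-sync--schedule (&rest _)
  "Restart the idle timer; meant for `after-change-functions'."
  (when (timerp sketch-live-sync--timer)
    (cancel-timer sketch-live-sync--timer))
  (setq sketch-live-sync--timer
        (run-with-idle-timer sketch-live-sync-delay nil
                             #'sketch-live-sync--flush (current-buffer))))

(defun sketch-live-sync-enable (send-function)
  "Call SEND-FUNCTION with the buffer text after each burst of edits."
  (setq sketch-live-sync-function send-function)
  (add-hook 'after-change-functions #'sketch-live-sync--schedule nil t))
```

Calling `(sketch-live-sync-enable #'my-send)` from 'exwm-edit-mode-hook, with `my-send` being whatever function pastes text into the app, would wire it up. The pre-selection caveat described above is unchanged by this.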
Like https://github.com/alpha22jp/atomic-chrome.
It's useful when the text area being written to can use the "partial text", e.g. in something like the google chrome omnibar where suggestions are shown beneath it as you type.