firecat53 / urlscan

Mutt and terminal url selector (similar to urlview)
GNU General Public License v2.0
214 stars 38 forks source link

Cannot use urlscan in a non-interactive session #115

Closed typoon closed 3 years ago

typoon commented 3 years ago

When trying to use urlscan in a non-interactive session (via a tmux plugin or rofi) the following error is thrown:

Traceback (most recent call last):
  File "/home/gilgamesh/code/urlscan/bin/urlscan", line 191, in <module>
    main()
  File "/home/gilgamesh/code/urlscan/bin/urlscan", line 166, in main
    msg = process_input(args.message)
  File "/home/gilgamesh/code/urlscan/bin/urlscan", line 140, in process_input
    close_stdin()
  File "/home/gilgamesh/code/urlscan/bin/urlscan", line 100, in close_stdin
    fdesc = os.open('/dev/tty', os.O_RDONLY)
OSError: [Errno 6] No such device or address: '/dev/tty'

It can be reproduced by creating a script in your home directory like this:

#!/usr/bin/env bash

items=$(echo "https://www.google.com" | urlscan -d -n -c > /tmp/log.txt 2>&1)
echo $items

Run that script through rofi for example (or any way you have of running it non-interactively) and check the contents of /tmp/log.txt

I have written a fix for this issue that I will submit the PR in a few minutes.

Boruch-Baum commented 3 years ago

I don't understand how your example reproduces the error that you're reporting. I can and do use urlscan fine from a tmux plugin. Can you give an example of piping it into rofi? For your rofi case, I'm stuck not able to pipe any urls, but that's because I don't see an appropriate rofi "mode", not because of any error in urlscan.

More fundamentally, I don't understand why you're even trying to run urlscan non-interactively. The whole point of it is to be an interactive command! Any attempt to use urlscan non-interactively will have to perform "double-work" in parsing the urls out of the urlscan output because after urlscan itself parses the urls, it outputs much more than the raw urls. That's true even when option --compact is used.

typoon commented 3 years ago

It is not piping it through rofi, it is literally running it through rofi. Open rofi and type the whole path to your shellscript in it. You will be able to reproduce it like that.

The use case is passing the output of urlscan to fzf when being used from the context of a tmux plugin.

Try this plugin I created: https://github.com/typoon/tmux-urlscan-fzf

Without the patch I submitted it does not work, because urlscan is ran non-interactively.

Hopefully this clears things out. Let me know if it still does not make sense.

Boruch-Baum commented 3 years ago

On 2021-05-28 08:58, Henrique M. D. wrote:

It is not piping it through rofi, it is literally running it through rofi. Open rofi and type the whole path to your shellscript in it. You will be able to reproduce it like that.

Ah. I missed that.

Hopefully this clears things out. Let me know if it still does not make sense.

It still does not make sense to me as a problem in urlscan. For years I've been using a different tmux plugin[1] in conjunction with urlscan, and it has always worked fine, without the error that you are reporting. See there for how it sends data from the tmux buffer to urlscan[1]. I have the plugin configured in my .tmux.conf as follows:

set -g @plugin 'tmux-plugins/tmux-urlview'
set -g @urlview-extractor 'urlscan -d -w 80 -W '

[1] https://github.com/tmux-plugins/tmux-urlview

-- hkp://keys.gnupg.net CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0

typoon commented 3 years ago

That plugin saves the content from the buffer into a file and then makes urlscan read from it. I wanted to avoid thatas I'd rather not save the contents of the buffer to the disk like that.

The change I proposed to the code shouldn't affect how urlscan works. It just checks if you are not in an interactive session it reads from stdin instead of the tty. There isn't really any negative side to having that added.

On Fri, May 28, 2021, 2:46 PM Boruch Baum @.***> wrote:

On 2021-05-28 08:58, Henrique M. D. wrote:

It is not piping it through rofi, it is literally running it through rofi. Open rofi and type the whole path to your shellscript in it. You will be able to reproduce it like that.

Ah. I missed that.

Hopefully this clears things out. Let me know if it still does not make sense.

It still does not make sense to me as a problem in urlscan. For years I've been using a different tmux plugin[1] in conjunction with urlscan, and it has always worked fine, without the error that you are reporting. See there for how it sends data from the tmux buffer to urlscan[1]. I have the plugin configured in my .tmux.conf as follows:

set -g @plugin 'tmux-plugins/tmux-urlview' set -g @urlview-extractor 'urlscan -d -w 80 -W '

[1] https://github.com/tmux-plugins/tmux-urlview

-- hkp://keys.gnupg.net CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/firecat53/urlscan/issues/115#issuecomment-850602113, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAF5IA57SVLNCAQCFZ3OZJLTP7QJDANCNFSM4427QWUA .

Boruch-Baum commented 3 years ago

On 2021-05-28 12:12, Henrique M. D. wrote:

That plugin saves the content from the buffer into a file and then makes urlscan read from it. I wanted to avoid that as I'd rather not save the contents of the buffer to the disk like that. The change I proposed to the code shouldn't affect how urlscan works. It just checks if you are not in an interactive session it reads from stdin instead of the tty. There isn't really any negative side to having that added.

Likely not, although it seems odd to need to use a try/catch/error + guess idiom for something so basic and determinate. That's not my focus just now. What's on my mind is that you probably wrote your plugin not selfishly for yourself but to share as widely as possible. And for that, it needs to be flexible to the reasonable use-cases of others. The urlscan might or might not accept your PR, but if you don't make the change on your end, your plugin will be useless to anyone wanting to use it for something else. Just look at the plugin I mentioned prior. It was clearly written with urlview specifically in mind (it's called -urlview!), yet it was engineered flexible enough to handle me using it for urlscan instead. Any number of people anywhere may be silently using it with any number of other programs. If you don't make the change on your side (and the only reason you've given is that "you'd rather not") then you lock out other uses for the program, or impose upon all the others to conform to you.

-- hkp://keys.gnupg.net CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0

firecat53 commented 3 years ago

Thank you for the conversation! I promise I am following, although I have not had time to sit down and process it fully. I hopefully will have time soon!

Boruch-Baum commented 3 years ago

On a suspicion, I just now did a simple web on 'fzf tmux'. Have you tried that?

It seems that the author of fzf bundles in his repository a tmux script[1]! And there are other repositories that show up on the search page. Yours may be superior to theirs but you may want to compare and possibly collaborate with one of those. In particular, since the author of fzf seems tmux-friendly enough to have written a plugin, he may agree to bundle yours into his package, and then your work would get very wide adoption immediately.

[1] https://github.com/junegunn/fzf#fzf-tmux-script

-- hkp://keys.gnupg.net CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0

typoon commented 3 years ago

Not wanting to write to the filesystem is to prevent leaking possible sensitive data to /tmp/ that could be in the buffer. It also allows for a race condition that could lead to the file being overwritten by someone in the same machine and injecting malicious URLs in it. Writing to the user's home to prevent that will pollute their home directory. It is mostly a security concern than anything else.

The same argument you make for the plugin can be made for urlscan. As is, it does not work for non-interactive sessions, so some people (like me) end up excluded from being able to use it.

Thank you for your time. I do hope you accept the PR, but if not I guess I just have to live with it :)

Thanks!

On Fri, May 28, 2021, 4:53 PM Scott Hansen @.***> wrote:

Thank you for the conversation! I promise I am following, although I have not had time to sit down and process it fully. I hopefully will have time soon!

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/firecat53/urlscan/issues/115#issuecomment-850664230, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAF5IA5YBPDAOR6H4K32DDDTP77DTANCNFSM4427QWUA .

Boruch-Baum commented 3 years ago

On 2021-05-28 13:59, Henrique M. D. wrote:

Not wanting to write to the filesystem is to prevent leaking possible sensitive data to /tmp/ that could be in the buffer. It also ...

All the concerns you list are admirable, but they've all been solved decades ago with standardized solutions.

Thank you for your time.

Quite welcome. I'm a potential (maybe, maybe not) future user!

I do hope you accept the PR,

Not my decision.

but if not I guess I just have to live with it :)

Even if Scott accepts your PR, do consider my advice.

-- hkp://keys.gnupg.net CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0

firecat53 commented 3 years ago

I'll go ahead and merge the PR as it seems so far to have no negative side effects. If it helps you get creative in using urlscan, I'm all for it! Thanks to the both of you for the respectful conversation and ideas thrown around. This is how open source should work!