atuinsh / atuin

✨ Magical shell history
https://atuin.sh
MIT License

Store command's stdout/stderr #2179

Open FlysoftBeta opened 3 months ago

FlysoftBeta commented 3 months ago

First of all, thanks for your great work!

After using Atuin for a few days, I think it would be great to have an option that lets users store command output. This is especially useful when I'm trying to reproduce an error that occurred previously. There could also be an option to determine how long the history is kept.

EDIT: FYI, if you're interested in this, please 👍 this post

ellie commented 3 months ago

Thanks for the kind words!

I've been thinking about this a bit lately, so it's nice to get validation it could be useful. We'd have to be careful about disk usage though!

What sort of UX would work for you around it - should output be part of search?

FlysoftBeta commented 3 months ago

Thanks for the quick reply!

> What sort of UX would work for you around it - should output be part of search?

edit: logs could be searched with a keyword like log:panic;my-command

[graph: original diagram not preserved]

ellie commented 3 months ago

Thank you for the detail there! I should have clarified though. I'm less concerned about how this might work in the UI, and more concerned about how users would interact with it and what their experience might be like

  1. If enabled, do we record everything by default? Or only when specified with something like "atuin record"/etc?
  2. Is the search part of the normal TUI (and should it then be fuzzy search and slow, or fts and fast? do users expect search over potentially gigabytes to be rapid?), or is it something separate tailored to this use case?
  3. Do outputs get synced across machines? I'd assume yes, but we'd want more configuration flexibility there
  4. How do we handle security here? There's likely to be sensitive information in stdout/err

But yeah more thinking about what sort of interactions you imagine with this, and not so much what the UI might look like

FlysoftBeta commented 3 months ago

> 1. If enabled, do we record everything by default? Or only when specified with something like "atuin record"/etc?

Can this be asked during installation? Or could it be included in a command like atuin setup to get users started?

We can probably provide different options for users to filter logs (e.g. all/stderr only/regex filtering/etc.)

> 2. Is the search part of the normal TUI (and should it then be fuzzy search and slow, or fts and fast? do users expect search over potentially gigabytes to be rapid?), or is it something separate tailored to this use case?

Yes, as I said before, it can be searched using a keyword. The search could be configured by the user to be fuzzy or precise, maybe.

GitHub's search is a good example: ![Alt](https://github.com/atuinsh/atuin/assets/49718840/285ef30b-13c1-4462-8e1f-d041fa9f6f63)

> 3. Do outputs get synced across machines? I'd assume yes, but we'd want more configuration flexibility there

Yes, it should be configurable.

> 4. How do we handle security here? There's likely to be sensitive information in stdout/err

We can store encrypted data locally and store secrets in OS-provided APIs (like KDE Wallet).

For data sync, the data should be encrypted, as is already done for shell history.

ellie commented 3 months ago

> The search can be configured to be fuzzy or precise by user, maybe.

I think the issue here is that fuzzy search is actually quite slow, and is only fast because Atuin isn't really searching very much data. As soon as we're searching over a much larger index, it will crawl. The only viable solution there is a proper search index. SQLite can do this, however I've found it's not very good for searching shell commands (though I keep trying to configure it to do so, it never quite feels right).

I think the only viable option at the moment is for it to be separate to shell history search
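To illustrate what a "proper search index" buys over scanning every stored output, here is a minimal Python sketch of an inverted index. Everything in it (the `OutputIndex` class, the tokenization rule, the sample outputs) is hypothetical and not part of Atuin; a real implementation would more likely build on SQLite's full-text search or a dedicated search library.

```python
import re
from collections import defaultdict

# Deliberately naive tokenization rule (an assumption for this sketch).
TOKEN = re.compile(r"[A-Za-z0-9_]+")


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return [t.lower() for t in TOKEN.findall(text)]


class OutputIndex:
    """Toy inverted index: token -> set of history entry ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.outputs = {}

    def add(self, entry_id, output):
        """Store an output and register each of its tokens."""
        self.outputs[entry_id] = output
        for token in tokenize(output):
            self.postings[token].add(entry_id)

    def search(self, query):
        """Return ids of entries whose output contains every query token."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self.postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result


index = OutputIndex()
index.add(1, "thread 'main' panicked at src/main.rs:4")
index.add(2, "Compiling atuin v18\nFinished dev profile")
index.add(3, "error: thread panicked while compiling")
```

A lookup such as `index.search("panicked compiling")` only touches the posting sets for those two tokens, so its cost depends on how many entries contain them rather than on the total size of the stored output. The trade-off is that matching is exact per token, not fuzzy.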

> We can store encrypted data locally and store secrets in OS-provided APIs like (KDE Wallet).

This would need to be decrypted somewhere for search to work - so the concerns around security remain, as the output can't simply stay encrypted. We will need to research whether a decrypted inverted index with an encrypted document store is acceptable.

The biggest blocker for this imo is that we currently have no way of syncing large blobs. Atuin syncs small blobs, larger ones would need a different storage method. Really this would be an object store, and a pointer to that in the database.
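The "object store plus a pointer in the database" idea can be sketched roughly as follows. This is only an illustration under assumptions: the `BlobStore` class, the `history` table layout, and the use of a local directory as the object store are all made up for the example and do not reflect Atuin's actual schema or sync protocol.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


class BlobStore:
    """Content-addressed files on disk; stands in for a real object store."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        """Write data under its SHA-256 digest and return that key."""
        key = hashlib.sha256(data).hexdigest()
        path = self.root / key
        if not path.exists():  # identical outputs are stored only once
            path.write_bytes(data)
        return key

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


def record(db, store, command, output: bytes):
    """Store the output as a blob and keep only its key in the database."""
    key = store.put(output)
    db.execute(
        "INSERT INTO history (command, output_key) VALUES (?, ?)",
        (command, key),
    )


def load_output(db, store, command):
    """Follow the pointer from the most recent matching row to its blob."""
    row = db.execute(
        "SELECT output_key FROM history WHERE command = ? "
        "ORDER BY id DESC LIMIT 1",
        (command,),
    ).fetchone()
    return store.get(row[0]) if row else None


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, command TEXT, output_key TEXT)"
)
store = BlobStore(tempfile.mkdtemp())
record(db, store, "cargo build", b"Compiling...\n" * 1000)
```

With this split, the database rows stay small enough to sync the way history already does, and only the blob side needs a new storage and transfer path.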

This would definitely be a cool feature, and the above problems can be solved, but I'd like to gauge demand before anyone considers implementing it. It's a very nontrivial implementation effort, with a lot of complexity involved - both client and server side. It would also make self hosting harder.

If you're interested in this, please 👍 the first post

FlysoftBeta commented 3 months ago

> The biggest blocker for this imo is that we currently have no way of syncing large blobs. Atuin syncs small blobs, larger ones would need a different storage method. Really this would be an object store, and a pointer to that in the database.

IMO, output sync wouldn't be used very often. So could it be a separate feature, implemented in the future? It could also put a lot of pressure on servers (the cost won't be low! LOL)

ellie commented 3 months ago

I think, if implemented, output sync would be a pretty common need/ask for users, so we should consider it as part of the implementation load. Either way, let's see what demand is like :)

johnny-mayo commented 1 month ago

I only just installed Atuin, and found myself here because, yes, I really want to be able to see the output of commands, more than just the exit code.

Sometimes I try variants of commands that are successful, like various regex in sed, and I realize that I broke something a few steps back, and I would like to see the outputs...just one "for instance" of many. Sure, I can scroll up in the terminal if it is something recent, but it is not always recent. :)

> 1) If enabled, do we record everything by default? Or only when specified with something like "atuin record"/etc?

Installation question and/or "atuin setup" would be great, and "atuin record" only on top of setting the default. I would want to have it on constantly, always, permanently, and redundantly. :)

> 2) Is the search part of the normal TUI (and should it then be fuzzy search and slow, or fts and fast? do users expect search over potentially gigabytes to be rapid?), or is it something separate tailored to this use case?

The search part should stay as it is, unless and until some magic key is pressed to tell Atuin that you want to include output in the search, or, a keyword like "log:panic;my-command" (as mentioned by FlysoftBeta above). Oh, and I vote for "fts and fast".
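As a rough illustration of how a keyword like "log:panic;my-command" could be handled, here is a small Python sketch. The query syntax is only the one floated in this thread, and `parse_query` and `search` are hypothetical names; nothing here is an existing Atuin feature. Matching is plain substring matching to keep the example short.

```python
def parse_query(raw):
    """Split 'log:<term>;<command>' into (log_term, command_term).

    A query without the 'log:' prefix is treated as a command-only search.
    """
    if raw.startswith("log:"):
        log_term, _, command_term = raw[len("log:"):].partition(";")
        return (log_term or None, command_term)
    return (None, raw)


def search(history, raw):
    """Search a list of (command, output) pairs.

    Commands are filtered first, so output is only scanned for entries
    whose command already matched.
    """
    log_term, command_term = parse_query(raw)
    matches = [(cmd, out) for cmd, out in history if command_term in cmd]
    if log_term is None:
        return [cmd for cmd, _ in matches]
    return [cmd for cmd, out in matches if log_term in out]


history = [
    ("my-command --fast", "thread panicked"),
    ("my-command --slow", "ok"),
    ("ls", "panic"),
]
```

Because the command filter runs first, a plain query never touches stored output at all, which keeps the existing search path unchanged unless the user opts in with the prefix.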

> 3) Do outputs get synced across machines? I'd assume yes, but we'd want more configuration flexibility there

Options. Later?

> 4) How do we handle security here? There's likely to be sensitive information in stdout/err

Pass the output through regex filters before it goes to file or is sent to the db...or, encrypt the database and/or output files in a compressed, auto-resizing, encrypted volume that mounts on boot with pam_mount when the user logs in (see below). The entire encrypted volume can be sent to the cloud for a full backup, even outside of syncing between machines.
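The regex-filter idea could look something like the Python sketch below. The two rules are illustrative assumptions only and are nowhere near a complete secret detector; `REDACTION_RULES` and `redact` are hypothetical names, not Atuin settings.

```python
import re

# Example rules only; a real deployment would need a much broader,
# user-configurable list.
REDACTION_RULES = [
    # key=value or key: value style secrets, e.g. "password=hunter2"
    (
        re.compile(
            r"(?i)\b(password|passwd|token|secret|api[_-]?key)(\s*[=:]\s*)\S+"
        ),
        r"\1\2[REDACTED]",
    ),
    # Strings shaped like AWS access key ids
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
]


def redact(output):
    """Apply every redaction rule before the output is written anywhere."""
    for pattern, replacement in REDACTION_RULES:
        output = pattern.sub(replacement, output)
    return output
```

For example, `redact("login ok password=hunter2 done")` keeps the key name and replaces only the value. Filtering like this reduces exposure but cannot catch secrets that do not match a known pattern, so it complements encryption rather than replacing it.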

From a UX/TUI perspective, it would be great if I could scroll through commands matching my search criteria (as Atuin does now), then hit some magic key on my keyboard (which I may customize) to split the screen: the commands I am still scrolling through stay on top, and the output for each command appears in the section (frame?) below, updating dynamically as I scroll. When I hit another magic key (tab, caps lock?), my cursor would jump from the command to its output, so that I can scroll through its output if it is longer than can be shown on the screen. Then maybe ESC or something would make the output pane (section, frame...thingie!) go away, and I'm back to scrolling through matching commands.

As far as how the command output is stored, I have seen research that suggests the performance cutoff for storing a long string of text in a database rather than in a file (SQLite, but Microsoft had very similar results with their SQL server) is about 4k. The thinking is that you are querying the database (opening one file) instead of opening each individual file (the output of each command), so there is less filesystem overhead. Or you could use a combo solution: anything smaller than 4k is in the db, and anything larger than 4k is a file (in an object store) on the disk. Then you could query the db for the shorter outputs AND spawn a subshell (to use another cpu core) to grep (or Rust equivalent) through the output/log files larger than 4k, returning the results to Atuin, to improve performance by splitting the work. So that the main command search function doesn't get slowed down, the command outputs can be stored in a separate database, which might also help with parallelizing the search? But if you search for "log:panic;my-command" and my-command isn't found, the log/output needn't be searched. I would think that would be the default behavior with a database; otherwise you would have to handle the condition yourself.
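A minimal Python sketch of the combo approach described above: small outputs live in the database row and larger ones spill to files. The 4096-byte `INLINE_LIMIT` simply mirrors the "about 4k" figure from this comment and is not a measured value; the `outputs` table, function names, and file naming are assumptions for the example. The search here is sequential for simplicity, not split across processes.

```python
import sqlite3
import tempfile
from pathlib import Path

INLINE_LIMIT = 4096  # bytes; taken from the "about 4k" figure, an assumption


def store_output(db, blob_dir, command, output: bytes):
    """Keep small outputs in the row; write larger ones to a file."""
    if len(output) <= INLINE_LIMIT:
        db.execute(
            "INSERT INTO outputs (command, inline, path) VALUES (?, ?, NULL)",
            (command, output),
        )
    else:
        cur = db.execute(
            "INSERT INTO outputs (command, inline, path) VALUES (?, NULL, NULL)",
            (command,),
        )
        path = Path(blob_dir) / f"{cur.lastrowid}.out"
        path.write_bytes(output)
        db.execute(
            "UPDATE outputs SET path = ? WHERE id = ?",
            (str(path), cur.lastrowid),
        )


def search_outputs(db, needle: bytes):
    """Return commands whose output contains needle, inline or on disk."""
    hits = []
    rows = db.execute(
        "SELECT command, inline, path FROM outputs ORDER BY id"
    ).fetchall()
    for command, inline, path in rows:
        data = inline if inline is not None else Path(path).read_bytes()
        if needle in data:
            hits.append(command)
    return hits


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE outputs "
    "(id INTEGER PRIMARY KEY, command TEXT, inline BLOB, path TEXT)"
)
blob_dir = tempfile.mkdtemp()
store_output(db, blob_dir, "echo hi", b"hi\n")
store_output(db, blob_dir, "make", b"x" * 5000 + b" panic")
```

Keeping this table in its own database file, as suggested, would also mean the existing history database is untouched when output recording is disabled.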

I know planning ahead is important, but maybe this feature could be developed incrementally? First, store the output, then make it searchable, then sync options? I'll have to see what you are doing for sync...sometime...

I looked for auto-resizable, compressed, encrypted volumes that could be mounted on login, where the command output files (and/or database) could be stored. eCryptfs, "The enterprise cryptographic filesystem for Linux", is supported by the kernel, is mountable on login (pam_mount), and offers compression, auto-resize, and, obviously, crypto, so it seems to check all the boxes.

Again, I just installed Atuin, and I don't know if the database is encrypted already or if only the sync is encrypted, or, if the db is encrypted, why it is important to regex filter things like passwords from the commands. I use Qubes OS, which uses FDE and Xen, for anything secure; everything else on prem is just a dev playpen.

CryFS seems to check all the boxes, and is built for the cloud...but power loss can destroy the entire encrypted volume, and it has no recovery tools. gocryptfs doesn't support compression, VeraCrypt and FUSE don't auto-resize, KDE Wallet is not mountable, searchable, or compressed, and ZFS...well...yea...anyway...and nothing else seems to come close to the requirements.

I tried HSTR, but it was disappointing. Atuin is what I'll be using now. Love it.

Thank you,

johnny-mayo