gokcehan / lf

Use the '\0' character as file name delimiter in $fx #47

Closed dumblob closed 4 years ago

dumblob commented 8 years ago

I'm having huge issues with any delimiter other than '\0', because I work with many weirdly named files. Currently the $fx variable contains file names separated by ':', which I can't unfortunately use. Could you please add a set option to use '\0' instead of ':' in $fx?

gokcehan commented 8 years ago

I think we can add an option for this. Off the top of my head, we need to think about a few consequences:

There should be a way to set the value of \0 to an option in the scanner (e.g. set filesep $'\0')
We should do the same change for the communication protocol or otherwise it wouldn't be possible to copy/paste files

gokcehan commented 7 years ago

I have now added octal escapes in double quoted strings. In theory null delimiter should have been possible with set filesep "\0" but it doesn't seem to be working or I was just testing it wrong. Other characters seem to be working fine (e.g. set filesep "\101"). I suspect the issue might be about os.Setenv call which may be confusing the null character in string with null termination. This needs further investigation.

On a related note, I have been reading this article about filenames:

https://www.dwheeler.com/essays/fixing-unix-linux-filenames.html

I realized our default colon (:) as separator has a few problems. First, it is usually a safe choice for path names since it is used as a path separator in many environment variables, but the same is not true for filenames. I see a lot of pdf files such as books which have a colon character in the title and so in the filename. Second, it is still a problem on Windows as it is the volume separator. The article mentions it is also a problem on MacOS but I'm not sure why.

So I was thinking, now that we have double quotes implemented, we can change the default file separator to the newline character (\n). Newline characters are allowed in filenames but I have never seen them used except for unintentionally. Currently communication protocol file separator is not configurable. At some point we were using newlines but I had changed it to colon for consistency. I guess we can change it back to newline for consistency again. Then it shouldn't matter much that it is not configurable.

gokcehan commented 7 years ago

I have tested the following code:

package main

import (
    "fmt"
    "os"
)

func main() {
    os.Setenv("x", "hello")
    x := os.Getenv("x")
    fmt.Printf("x = %s\n", x)
}

and it prints x = hello as expected. If I change the line to os.Setenv("x", "hello\000world") then it prints x =. I searched a little bit about it and it looks like it is not possible to store null characters in shell variables, so it is not possible to use the null byte as the file separator after all.
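The same limitation can be observed from the shell side. The following is a minimal sketch, assuming an sh-compatible shell (bash or dash, for instance; zsh behaves differently) and the standard printf, wc and tr utilities. The variable names are illustrative only. Shells differ in whether they drop the NUL byte or cut the value short at it, so the sketch only checks that the 11 bytes do not survive in a variable while they do survive in a pipe.

```sh
# Eleven bytes including one NUL: "hello", NUL, "world".
# Through a pipe, all 11 bytes arrive intact.
piped_len=$(printf 'hello\0world' | wc -c | tr -d ' ')

# Through a variable, the NUL cannot be stored: depending on the shell,
# it is dropped (possibly with a warning on stderr) or the value is cut short.
{ value=$(printf 'hello\0world'); } 2>/dev/null
stored_len=${#value}

printf 'piped: %s bytes, stored: %s characters\n' "$piped_len" "$stored_len"
```

This is why the later proposals in this thread move the NUL-delimited list into pipes or argv rather than into an environment variable.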

lenormf commented 7 years ago

Storing NUL is forbidden by the standard, so it's good your shell didn't allow it when you tried it!

These strings have the form name=value; names shall not contain the character '='. For values to be portable across systems conforming to POSIX.1-2008, the value shall be composed of characters from the portable character set (except NUL and as indicated below).

dumblob commented 7 years ago

I searched a little bit about it and it looks like it is not possible to store null characters in shell variables, so it is not possible to use the null byte as the file separator after all.

This doesn't seem to be a valid derivation. Currently the file separator serves at least 2 purposes:

To separate file names in lf (and its input & output).
To use the filesep value as a value of a shell variable.

The first purpose shall allow treatment of \0 as a separator without any issues.

The second one, though, is not possible in a "raw" form. The question is, why couldn't we pass the semantics of a file separator in some symbolic form? Think of some "encoding scheme" like UTF-8 or base64 or whatever you like. Basically the only requirement is that it must be quite easily parsable in a POSIX shell. E.g. the wc utility returns the number of characters, not bytes, so we can actually easily distinguish whether it's just one UTF-8 character or more characters in a row. This might allow using special words like null to designate that the \0 character is used in lf outputs and at the same time that lf expects \0-terminated input.

And there is another possible purpose of the filesep option - the communication protocol file separator, which should be bullet proof in the first place. I.e. use the null character and not a newline.

gokcehan commented 7 years ago

@dumblob I don't understand what you mean by the first purpose and also the suggestion about encoding. Can you elaborate and give examples in both Go and shell code?

dumblob commented 7 years ago

@gokcehan imagine lf does not support filesep nor anything similar. Internally lf would in that case use go strings for file names (i.e. no concerns with \0) and for the IPC protocol either TLV (or just "LV") or \0 as the separation character. Now, if we wanted to extend this behavior to be user-exposed (and thus also user-defined) because of the need for lf-user_script interaction, we need to allow the user to express her will about how her user_script expects the file names to be separated and how the file name lists produced by the user_script should be separated.

And this expression of will can be accomplished using any means. We actually do not even have to obey the POSIX sh limitation of variable content, because we can pass everything in pipes (which do not have this limitation) or we can use sh variables, but \0s in their content will be encoded or we can use any encoding which is well-processable using POSIX sh (TLV is an option, even though it would be a bit more work; a better multiplatform solution might be to temporarily open a FIFO named pipe or even a dedicated socket, and pass the pipe/socket name/handler/file_descriptor in an environment variable to the user_script; there are of course other options ...).

I would prefer lf sending a list of zero-terminated file names directly to the standard input of the user_script with the first record (the first N bytes until the first \0) not being a file name, but serving as a header being preserved for future use (might be empty now, but must be there). This allows easy tail -z -n +2 | ... processing (-z is not POSIX, but GNU and BSD) or xargs -d '\0' my_app (-d is not POSIX, but GNU and BSD) or even tr '\0' '\n' | ... for those being sure there is no filename with the LF character in it (the "POSIX" way of doing things, sadly). I would make this default and maybe rather not user-configurable (it's easy to handle it on most systems including Windows, BSD and GNU-utilities based systems like the majority of Linux distributions). Asynchronous execution of scripts (which would otherwise block lf) could use buffered output and once the buffer is full, just run lf until the buffer is not empty. Or it could be solved by passing a named pipe identifier/name in a variable to the user_script and not putting anything to stdout, but rather letting user_script call lf -remote ... to read the stream of zero-terminated records (file names) at some point later.
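To make the proposed stream format concrete, here is a minimal sketch of a consumer. It assumes xargs -0 is available (GNU/BSD, not POSIX); the helper name count_names and the header value v1 are hypothetical, and lf does not emit such a stream today. The header record is discarded with shift after xargs has turned every record into one argument.

```sh
# Hypothetical consumer for a NUL-terminated record stream whose first
# record is a reserved header rather than a file name.
# Prints the number of file names received.
count_names() {
    # xargs -0 turns each NUL-terminated record into one argument;
    # "shift" then discards the header record.
    # Caveat: a very long list may be split across several sh invocations,
    # in which case this sketch would miscount.
    xargs -0 sh -c 'shift; printf "%s\n" "$#"' sh
}

# Simulated stream: header "v1", then two names, one with a space
# and one with an embedded newline.
received=$(printf 'v1\0with space.txt\0with\nnewline.txt\0' | count_names)
printf 'names received: %s\n' "$received"
```

Both names arrive as single arguments, including the one that contains a newline.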

gokcehan commented 7 years ago

@dumblob You suggest several alternatives and argue that they are simple but I can't think of easy ways to do even the simplest things. I specifically asked for examples for this reason. For example currently I can move selected files to a new folder with something like $mkdir asd; mv $fs asd. What is the equivalent of this command in your suggestions?

We can use null character in the communication protocol but what would be the point? We use a custom protocol instead of a standard one like JSON-RPC so that it would be easy to type. Filenames with newlines are already not handled properly in other places. For example they are shown as a single line in the ui and I can't think of a simple alternative for this. So maybe it is best to assume that filenames will not have newline characters and design the rest for simplicity. People have been writing shell scripts with this assumption for quite some time.

dumblob commented 7 years ago

$mkdir asd; mv $fs asd

Hm, I doubt anyone would do this as it won't work for files with IFS in their name (and actually lots of files have e.g. spaces in their names).

Disregarding the note above, the solution with \0 could look like this on GNU/BSD systems (-0 is not POSIX): $mkdir asd && lf -remote 'read marked' | xargs -0 -I my_sub mv my_sub asd/ or like this on POSIX systems: $mkdir asd && lf -remote 'read marked' | tr '\0' '\n' | xargs -I fname mv fname asd/ (it's unsafe and without any checks).
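The GNU/BSD variant can be tried without lf. In the sketch below, find -print0 stands in for the proposed 'read marked' remote command, which does not exist in lf; move_nul_list is a hypothetical helper name, and mktemp -d and xargs -0 are assumed to be available.

```sh
# Hypothetical helper: read NUL-terminated names on stdin and move each
# one into the directory given as $1. xargs -0 is GNU/BSD, not POSIX.
move_nul_list() {
    xargs -0 -I '{}' mv -- '{}' "$1"/
}

work=$(mktemp -d)
mkdir "$work/src" "$work/asd"
: > "$work/src/plain.txt"
: > "$work/src/with space.txt"
: > "$work/src/with
newline.txt"

# find -print0 stands in for a command that would emit the selection
# as NUL-terminated records.
find "$work/src" -type f -print0 | move_nul_list "$work/asd"

# Count files by counting NUL terminators, not lines.
moved=$(find "$work/asd" -type f -print0 | tr -dc '\0' | wc -c | tr -d ' ')
left=$(find "$work/src" -type f -print0 | tr -dc '\0' | wc -c | tr -d ' ')
printf 'moved: %s, left behind: %s\n' "$moved" "$left"
rm -rf "$work"
```

Replacing the NUL terminators with newlines via tr, as in the POSIX variant, would split the third name into two non-existent paths.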

Filenames with newlines are already not handled properly in other places.

What are all these other places (except for those with visual output)? We should really fix them.

For example they are shown as a single line in the ui and I can't think of a simple alternative for this.

That's easy to fix. We don't need the exact file name, but just a good enough visual representation of "a pointer to file". Plain \n (and other unprintable characters) substitution (at least for ? at the beginning and maybe vim-style like <LF> in the future) seems to be the way to go (in ranger it's unfortunately not yet fully solved: https://github.com/ranger/ranger/issues/498 ).

So maybe it is best to assume that filenames will not have newline characters and design the rest for simplicity. People have been writing shell scripts with this assumption for quite some time.

I'm getting scared. We shouldn't assume anything. We should make sure lf won't cause harm even in less-frequent cases. Either avoid these cases completely or solve them in a bullet-proof way. I'm confident everybody who already lost some data because of wrong file manipulation (disregarding backups) won't use a file manager, which just "assumes" things.

gokcehan commented 7 years ago

$mkdir asd; mv $fs asd

Hm, I doubt anyone would do this as it won't work for files with IFS in their name (and actually lots of files have e.g. spaces in their names).

If you have set the ifs variable in your config file then you won't have to explicitly set IFS each time in your commands, which is what I considered here. By the way, all your suggestions so far disregard the use of the IFS variable, which is what the current method is designed for.

Disregarding the note above, the solution with \0 could look like this on GNU/BSD systems (-0 is not POSIX): $mkdir asd && lf -remote 'read marked' | xargs -0 -I my_sub mv my_sub asd/ or like this on POSIX systems: $mkdir asd && lf -remote 'read marked' | tr '\0' '\n' | xargs -I fname mv fname asd/ (it's unsafe and without any checks).

I think this is yet another alternative to those you previously suggested because I cannot infer this from your previous replies. I'm guessing you did not want to directly handle stdin in user scripts, which would be very difficult if not impossible, and decided to use remote commands for this purpose. I assume 'read marked' is a new remote command you came up with, which is currently not possible since each client has its own set of marked files and these are stored on the client side. And your posix version still does not handle newlines properly, with a much more complicated command.

Filenames with newlines are already not handled properly in other places.

What are all these other places (except for those with visual output)? We should really fix them.

Visual output and also the current IPC protocol do not handle newlines properly. These are the two places that I know of but there might be others as well because I haven't been testing with files having newline characters up until this issue came up.

For example they are shown as a single line in the ui and I can't think of a simple alternative for this.

That's easy to fix. We don't need the exact file name, but just a good enough visual representation of "a pointer to file". Plain \n (and other unprintable characters) substitution (at least for ? at the beginning and maybe vim-style like <LF> in the future) seems to be the way to go (in ranger it's unfortunately not yet fully solved: https://github.com/ranger/ranger/issues/498 ).

If you show newlines with \n then a file with a newline character will be represented the same as a file with the two individual characters \ and n, which are both allowed in filenames. This is true for all your other suggestions. Filenames can contain the ? character, which is much more common than the newline character, and the < and > characters as well. So basically your suggestion is to handle newline characters but create other corner cases in the meantime.

So maybe it is best to assume that filenames will not have newline characters and design the rest for simplicity. People have been writing shell scripts with this assumption for quite some time.

I'm getting scared. We shouldn't assume anything. We should make sure lf won't cause harm even in less-frequent cases. Either avoid these cases completely or solve them in a bullet-proof way. I'm confident everybody who already lost some data because of wrong file manipulation (disregarding backups) won't use a file manager, which just "assumes" things.

This is irrational for several reasons. First, you're mixing up common pitfalls (e.g. ' ' in filenames) with corner cases (e.g. '\n' in filenames). In my nearly 10 years of linux/unix use, I have never seen a file with a newline character. Second, there are many applications which do not handle newlines already and they are not documented either. You have most likely used one of these programs before without a problem. Most shell scripts make this assumption as well and I don't think you check each script source for this corner case before you execute it. And even most commands you're likely using daily assume one of the corner cases is not possible. I find it hard to believe that you have never typed something like rm * in your shell, which does not handle leading dashes in filenames properly. Or even a simple ls, which can print control characters and bork your terminal, if not worse, since these are also allowed in filenames.

It is not that I don't want to handle this issue, but I don't see a "bullet-proof" solution, at least with posix compliance. The article I have linked before basically argues that this is an open problem and each method assumes something and newline assumption looks like the best trade-off here. If you're worried about this, you can check your filesystem to see if you have a filename with a newline character or not.
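One way to run the suggested filesystem check is sketched below. It assumes a find that supports -print0 (GNU and BSD both do) and mktemp -d; count_newline_names is a hypothetical helper name. The count is taken from NUL terminators, because counting output lines would itself be confused by the names being searched for.

```sh
nl='
'
# Hypothetical helper: count entries under $1 whose name contains a newline.
count_newline_names() {
    find "$1" -name "*${nl}*" -print0 | tr -dc '\0' | wc -c | tr -d ' '
}

# Demonstration on a scratch directory with one ordinary and one odd name.
demo=$(mktemp -d)
: > "$demo/ordinary.txt"
: > "$demo/odd${nl}name.txt"
found=$(count_newline_names "$demo")
printf 'names containing a newline: %s\n' "$found"
rm -rf "$demo"
```

Pointing the helper at a home directory instead of the scratch directory gives the check described above.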

dumblob commented 7 years ago

I think this is yet another alternative than those you previously suggested because I can not infer this from your previous replies.

Actually it's the thing I proposed. I might have described it clumsily though :wink: (refer to

I would prefer lf sending a list of zero-terminated file names directly to the standard input of the user_script

and

Or it could be solved by passing a named pipe identifier/name in a variable to the user_script and not putting anything to stdout, but rather letting user_script call lf -remote ... to read the stream of zero-terminated records (file names) at some point later.

above).

I just skipped the "first line" requirement in the example as it's not needed if we use lf -remote operations which designate beforehand what's exactly in the output and that the output is to be found e.g. on stdin (so no need to pass any named pipe identifier/name in a variable etc.). The shown code is undoubtedly bullet-proof, POSIX compatible, fast, and very easy to use.

Visual output and also the current IPC protocol does not handle newlines properly. These are the two places that I know of but there might be others as well because I haven't been testing with files having newline characters up until this issue came up.

Good to know, thanks! This shouldn't be difficult to fix - for visual output it's just one string filter function and for IPC we have plenty of encoding options, but I would probably stick with simple escaping (that is dead-easy to do with tr).

If you show newlines with \n then a file with a newline character will be represented the same as a file with the two individual characters \ and n, which are both allowed in filenames. This is true for all your other suggestions. Filenames can contain the ? character, which is much more common than the newline character, and the < and > characters as well. So basically your suggestion is to handle newline characters but create other corner cases in the meantime.

Of course. The point is, that we show the semantic information to the user as that's exactly what one expects from a file manager (i.e. for usual file names there won't be any change; for special file names though, one should know there are some specialities, which are well handled, rather than to see some garbage as UTF-8 allows characters which "erase" or overlap previous characters, even on previous lines, etc.). We might also use colors or underscore/bold/italic/strikethrough/... (on VT100 terminals) in addition to special characters.

This is irrational for several reasons. First, you're mixing up common pitfalls (e.g. ' ' in filenames) with corner cases (e.g. '\n' in filenames). In my nearly 10 years of linux/unix use, I have never seen a file with a newline character. Second, there are many applications which do not handle newlines already and they are not documented either. You have most likely used one of these programs before without a problem. Most shell scripts make this assumption as well and I don't think you check each script source for this corner case before you execute it. And even most commands you're likely using daily assume one of the corner cases is not possible. I find it hard to believe that you have never typed something like rm * in your shell, which does not handle leading dashes in filenames properly. Or even a simple ls, which can print control characters and bork your terminal, if not worse, since these are also allowed in filenames.

Actually I experienced exactly the issues you describe. And the frequency was so high and the results so severe that I wrote my very own rm several years ago, which is completely safe (it's a POSIX shell wrapper for /bin/rm which sanitizes input and also acts as a circular trash buffer; this wrapper is pretty big - 133 SLOC - and full of quirks to overcome POSIX limitations). I did a similar thing with mv etc. Since then I really haven't had any (I swear) file-management related issue in the terminal with my .profile loaded.

The argument here, though, is that lf standalone wouldn't have issues with unconventional file names, but once lf offers interaction with the surrounding environment (through scripts and pretty recently also through lf -remote), it's so easy to write incorrect code which will feed lf with crappy file names even though no such files exist on the system.

It is not that I don't want to handle this issue, but I don't see a "bullet-proof" solution, at least with posix compliance.

In this thread we demonstrated it's easily possible to have a bullet-proof solution on the vast majority of systems and a standard solution for POSIX systems (while being just 14 characters longer: | tr '\0' '\n'). So I would say there is no need for trade-offs. Or am I still missing something (except for the tedious work to fix the current code based around the newline and similar assumptions)?

lenormf commented 7 years ago

I wrote my very own rm several years ago which is completely safe

Do you have that versioned somewhere?

In this thread we demonstrated it's easily possible to have a bullet-proof solution on the vast majority of systems and a standard solution for POSIX systems (while being just 14 characters longer: | tr '\0' '\n')

This wouldn't handle files with newline characters though, which lf currently doesn't support anyway.

Seems like using newline characters as separator is the easiest and good enough step to take now, until a clear foolproof solution can be designed (which seems to lean toward using NUL as a separator, but even if that was implemented, how would such a file list be handled in a POSIX fashion?).

dumblob commented 7 years ago

Do you have that versioned somewhere?

Not now (it's part of my 1600 lines long .profile and not decoupled), but I'll do it. Stay tuned :wink:.

This wouldn't handle files with newline characters though

Of course. But it makes it explicit, which is what we want - if one wants to sacrifice safety, it needs to be him, not lf.

Seems like using newline characters as separator is the easiest and good enough step to take now, until a clear foolproof solution can be designed (which seems to lean toward using NUL as a separator, but even if that was implemented, how would such a file list be handled in a POSIX fashion?).

This is also easy. If you don't like the "make the mistake yourself" option (| tr '\0' '\n'), then we must stick with the solution which Neovim, Dao and other highly multiplatform projects utilize. Namely, they offer the missing functionality themselves. In the case of lf this would mean the introduction of lf -cmd, allowing lf -cmd read0 my_var, which would read stdin until the first \0 and put the read data into the my_var env variable.

drwilly commented 5 years ago

I would like to suggest a different approach to the filename problem: Currently scripts receive lf's selection via env-vars and the commandline via the script's argv. My idea is to reverse this: pass lf's selection via the script's argv and the commandline via an env-var such as $LF_CMDLINE. Passing filenames via argv is safe, while the commandline is user input, that the shell is well equipped to handle. I suggest the convention f="$1" and fs="$(shift; echo "$@")", even though passing f="$0" and fs="$@" would also be possible.

An advantage this would have aside from safety, is that currently commands such as :rename require quoting: Running :rename My File.txt will currently result in a file called My while the File.txt part is quietly ignored. To get to the desired result you have to enter :rename "My File.txt". While the latter is in line with shell-behaviour I would argue that it is better to leave it up to the script how it handles its input.
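The safety of argv, and its dependence on quoting, can be checked with a small sketch. It assumes a POSIX sh with the default IFS (space, tab, newline); count_args is a hypothetical helper, and the set -- line merely simulates lf passing the selection as positional parameters, which lf does not do today.

```sh
# Hypothetical helper: print how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# Simulate lf handing the selection to a script as positional parameters.
set -- 'plain.txt' 'with space.txt' 'with
newline.txt'

quoted_count=$(count_args "$@")    # one argument per file name
# shellcheck disable=SC2068
unquoted_count=$(count_args $@)    # names are re-split on IFS characters
printf 'quoted: %s, unquoted: %s\n' "$quoted_count" "$unquoted_count"
```

With "$@" the three names stay intact; without the quotes the names containing a space or a newline are split again, so the guarantee holds only for correctly quoted scripts.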

dumblob commented 5 years ago

@drwilly sounds like an interesting idea to me - I'm also kind of tempted to agree with the statement leave it up to the script how it handles its input

gokcehan commented 5 years ago

@drwilly Thank you for the suggestion. I remember someone suggesting passing selection with argv but then custom commands would not be able to take arguments. We then never considered swapping selection and arguments as you suggested. It seems like an interesting idea.

My current stance on this issue is to only adopt a solution if it does not make things more difficult for regular cases. For example something like $vim $fx should work as expected. What is the equivalent of this command in your suggestion? Also can you clarify what you mean by convention? Do we need to manually add those lines to every command or do you suggest we should add them to every command automatically? But then do we not have the same issue when we actually want to use these variables in a command?

In the rename example, arguments are parsed using the lf configuration syntax, so you will always need to quote it, otherwise they will be parsed as separate tokens and passed to the underlying command as separate arguments. Currently we consider adding a builtin rename command which should make it possible to get rid of quotes but it is a different issue.

drwilly commented 5 years ago

@gokcehan Initially my suggestion was to remove $f, $fs and $fx in favour of the argv entirely, but thinking about it, only $fs (and by extension $fx) is problematic when it comes to tricky filenames. By convention I just mean this: what are the contents of the argv? You could say the argv contains $fx, then the question is: Is there a use-case where a script wants to access $f AND $fs? Or you could say the argv contains $f and $fs, then the scripts would require the logic that splits $f and $fs from the argv. So the response for "What would the current $vim $fx look like then" depends on what we put in the argv. In the first example it would just become $vim $@.

gokcehan commented 5 years ago

@drwilly I think in the second case it is more like a use case where a script wants to access $f OR $fs. For example a rename command only kind of makes sense for the current file (i.e. $f). If we settle on the first convention (i.e. argv contains only $fx), then such a rename command would not work properly when there are selected files without dropping them first. With the second convention (i.e. argv contains $f and then $fs), other regular commands can get significantly more complicated. For example the current $vim $fx would become something like $[ $# -eq 1 ] && vim "$1" || (shift; vim "$@"). I think the second convention is much worse than the first so I will only mention the first convention for now.

As I mentioned above, one of the disadvantages of the first convention is losing the ability to use $f and $fs in commands. Other than that, you always need to quote your arguments. As far as I understand, $@ without quotes is no different than our current IFS method. On the other hand, I also feel like this proposition does not solve the issue but only moves it somewhere else. When we swap argv and arguments, we are going to need to pass the arguments as environment variables and then there will be people complaining about the lack of ability to use special characters in arguments.

drwilly commented 5 years ago

@gokcehan I agree on the $@ = $fx argument. For the rest I don't think that is a fair assessment of the situation. First of all if you look at it from the user's perspective you know if you want $f or $fs. There is no need to write $fx into an interactive prompt, ever. $fx is useful as a shorthand in a non-interactive environment, but not a critical feature. Second, if $vim $fx is such a frequently used command for you, I would think you'd create an :edit, :e or :vim command, thus eliminating the need to type in any variables in the first place. Lastly, if you haven't created such a command you still have access to an interactive prompt with TAB completion, thus $someprogram My\ Needlessly\ Long\ File\ Name.txt is not as big a problem as it might seem. This last point currently leads me to believe that $@ = $fs might be the most sensible way to go.

As for the 'moving the problem somewhere else' argument, I don't follow. Like what kind of command would a user want to input into his command line prompt that is no longer possible?

gokcehan commented 5 years ago

@drwilly Since you haven't addressed the extra requirement of quotes for $@ and you previously regarded the quoting requirement as an issue for our rename command, I assume you already acknowledge that it is a disadvantage of this approach. I should also note that missing quotes are one of the most common errors in shell programming, so this new approach may not actually be safer than the old one in practice.

You have now further divided the commands into two use cases, interactive and non-interactive, so I should also divide my arguments as such. For interactive use cases, I agree that you already know if you want $f or $fs beforehand, but you still need to write the respective part of the conditional to run it. If you think that $vim My\ Needlessly\ Long\ File\ Name.txt and $shift; vim "$@" are not more difficult than $vim $f and $vim $fs, then I have no further arguments to convince you otherwise.

For non-interactive commands, I assume you agree that $[ $# -eq 1 ] && vim "$1" || (shift; vim "$@") is more difficult than vim $fx but you do not regard this as an issue since this is non-interactive. If you do not acknowledge this as a disadvantage, again I have no further arguments to convince you otherwise.

In my examples, vim is just an example and you can consider it as someprogram.

For the last case, let's say you want to define a mkdir command that would take names of directories as arguments to create them. Currently you can define such a command as follows:

cmd mkdir $mkdir "$@"

You can then call this command with arguments including any IFS characters as such:

:mkdir "foo\nbar" "baz"

This would create two directories named foo\nbar and baz respectively.

If we pass arguments as an environment variable, the same command would then look something like the following:

set ifs "\n"
set argsep "\n"  # assume this is a new option like 'filesep' but for arguments
cmd mkdir $mkdir $args

Now this command cannot handle arguments with IFS characters, and when it is called as previously it will create three directories instead of two (i.e. foo, bar, baz). This is basically the equivalent issue for arguments and we will surely have someone complaining about the lack of ability to safely pass IFS characters in arguments.
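The loss of argument boundaries described here can be reproduced in plain sh. This is a sketch only: $args and the newline separator mirror the hypothetical argsep option from the example above, which is not an existing lf option.

```sh
nl='
'
# Two intended arguments: "foo<newline>bar" and "baz".
set -- "foo${nl}bar" 'baz'
argv_count=$#

# Join them with the separator, as a hypothetical $args variable would.
args="$1${nl}$2"

# Split again on the separator, as an unquoted $args with IFS=\n would.
old_ifs=$IFS
IFS=$nl
set -f                  # disable globbing while splitting
# shellcheck disable=SC2086
set -- $args
set +f
IFS=$old_ifs
split_count=$#

printf 'passed via argv: %s, recovered from variable: %s\n' "$argv_count" "$split_count"
```

Once the arguments are joined into a single variable, nothing distinguishes the newline inside the first argument from the separator.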

drwilly commented 5 years ago

@gokcehan I value convenience in interactive environments and correctness in non-interactive environments. Incorrectly quoted scripts are just that: incorrectly quoted. On the other hand, IFS=\n and usage of $@ are not mutually exclusive. If a user wants to set IFS=\n and use $@ in an interactive environment, sure. Still, I would be much happier if everything that is part of lf's distribution was properly quoted.

I do agree that $vim $f is more convenient than $vim My\ Needlessly\ Long\ File\ Name.txt however only marginally so, due to the existence of TAB-completion in the prompt. I don't agree that $vim $fs is more convenient than $vim $@. (Again, I think $@ = $fs and keeping $f might be the best, since only lists of files are problematic in a single variable. $f is ok because it is only a single file) I do agree that $fx is more convenient than [ $# -gt 0 ] && vim "$@" || vim "$f". It is simply a price I am willing to pay.

As for the mkdir/rename example, again, I value correctness in non-interactive environments (see above) and convenience in interactive usage. Thus I would want my :mkdir to not require quotes. This means that :mkdir "foo\nbar" "baz" would create ONE directory called "foo\nbar" "baz" (jesus, who would do that). The corresponding script would simply be mkdir -- "$args".

To actually address your point, yes that is again problematic. However you CAN work around:

args='"foo bar" "baz"'
eval "printf '<%s>' $args"

will (correctly) output <foo bar><baz>. (You can do the same with properly quoted '\n', it just looks uglier on github). And yes, putting things into eval makes the script uglier. But the interactive usage is convenient and the non-interactive usage is correct.

dumblob commented 4 years ago

Sorry for being inactive for so long. After reading the discussion I have to 100% agree with @drwilly. I think his proposal is the better successor to my proposal(s). Any chance of giving it another look?

I didn't use lf during the last 2 years, but now when I installed the newest revision, I totally fell in love with it. It feels really mature (!) and actually this is the only major issue I have with it. In the worst case, I'd even prefer just setting a variable unsafe=1 in case lf would detect ifs (the ifs of lf, not the IFS of sh) in any of the selected names when invoking any command to allow me to do [ "$unsafe" -eq 1 ] && lf -remote "send $id echoerr 'ERR some file/dir name contains lf ifs, so lf cannot handle them safely'".
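The detection step behind such an unsafe flag is small. A minimal sketch of the check follows, written as a shell function for illustration; in the proposal it would run inside lf before the command is invoked. The function name any_contains_sep and the variables are hypothetical.

```sh
nl='
'
# Hypothetical check: succeed if any of the names after $1 contains
# the separator given as $1.
any_contains_sep() {
    sep=$1
    shift
    for name in "$@"; do
        case $name in
            *"$sep"*) return 0 ;;
        esac
    done
    return 1
}

# A selection containing a newline in one name is flagged as unsafe.
if any_contains_sep "$nl" 'plain.txt' "odd${nl}name.txt"; then unsafe=1; else unsafe=0; fi
# A selection with only spaces is fine when the separator is a newline.
if any_contains_sep "$nl" 'plain.txt' 'with space.txt'; then clean=1; else clean=0; fi
printf 'unsafe=%s clean=%s\n' "$unsafe" "$clean"
```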

sharethewisdom commented 4 years ago

I had prepared a very similar issue offline, and I did not fully read and understand the above. But if you'd be so kind, could you please clarify why this, for example, does not split on newlines and quote the arguments?

set shell zsh
set filesep '\n'
cmd open !printf '%s\n' ${(qf)fx}

While this is properly quoted?

set shell zsh
set filesep '\n'
cmd open !str=`echo -n $fx`; printf '%s\n' ${(qf)str}

I wanted to use the null character as well. Similarly, this is properly quoted:

set shell zsh
set filesep '\0'
cmd open !str=`echo -n $fx`; printf '%s\n' ${(q0)str}

Reasoning for why '\0' support may be important can be found in man find(1):

This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output.

(once again, I'm sorry I can't currently be online long enough to read it all)

dumblob commented 4 years ago

@sharethewisdom didn't analyze your code, but do you know, that any (POSIX) shell variable is forbidden to contain a value having \0 in it? Therefore you can't use \0 as IFS. Second, do you know, that any (POSIX) subshell (be it `` or $()) trims all trailing newlines (and maybe other [:space:] characters)?
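The second point is easy to verify. The sketch below assumes only a POSIX sh; the sentinel trick shown is a common workaround and not something lf provides.

```sh
# A value that really ends in two newlines (6 bytes: "name" + 2 newlines).
stripped=$(printf 'name\n\n')
stripped_len=${#stripped}

# Common workaround: append a sentinel character inside the substitution
# and remove it afterwards, so the newlines are no longer trailing.
kept=$(printf 'name\n\n'; printf 'x')
kept=${kept%x}
kept_len=${#kept}

printf 'plain substitution: %s characters, with sentinel: %s characters\n' \
    "$stripped_len" "$kept_len"
```

Any scheme that routes file names through a command substitution therefore needs such a workaround to preserve names that end in a newline.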

dumblob commented 4 years ago

@gokcehan if the solution @drwilly outlined is not fitting, could the less "controversial", easy to maintain and very tiny to implement solution introducing lf -cmd read0 my_var which would read stdin until the first \0 and would put the read data into my_var env variable as outlined above be acceptable?

alexherbo2 commented 4 years ago

How about hydrating arguments with:

eval "set -- $lf_quoted_fx"
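A sketch of how such a variable could be produced and consumed is shown below. Everything here is an assumption for illustration: lf does not export $lf_quoted_fx, and shell_quote is a hypothetical helper that uses POSIX single-quote escaping via sed. It has a known limitation, noted in the comment.

```sh
# Hypothetical quoting helper: wrap $1 in single quotes, rewriting each
# embedded single quote as '\''. Limitation: trailing newlines in $1 are
# lost to command substitution.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Build what a hypothetical $lf_quoted_fx could contain.
lf_quoted_fx=
for name in "it's.txt" 'with space.txt' 'with
newline.txt'; do
    lf_quoted_fx="$lf_quoted_fx $(shell_quote "$name")"
done

# Hydrate the positional parameters from the quoted list.
eval "set -- $lf_quoted_fx"
hydrated_count=$#
first_name=$1
third_name=$3
printf 'hydrated %s names\n' "$hydrated_count"
```

The quoting rules are specific to sh-compatible shells, so a consumer written for another shell would need a differently quoted variable.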

dumblob commented 4 years ago

@alexherbo2 would be OK for me for now. Though for cross-platform functionality it seems it would need underlying shell detection which doesn't sound like fun...

alexherbo2 commented 4 years ago

@dumblob Maybe more explicit names and formats, such as $lf_shell_quoted_fx and $lf_json_quoted_fx.

dumblob commented 4 years ago

@alexherbo2 yep, that sounds totally plausible and I'd be all for that.

dumblob commented 2 years ago

More years have passed and this is still unsolved as of today.

I am now convinced we must keep things ($f $fs $fx) as they are and only add the filling of $1 $2 $3 ... with non-preprocessed (i.e. without quoting or anything else) items from which $fs was constructed. Yes, it is redundant, but it offers 100% guarantees for virtually no price in lf (neither code-base-wise nor performance-wise).

Anyone wanting to make a PR?

drwilly commented 2 years ago

Hi @dumblob, I'm still getting notified because I commented on the issue years ago, but I've long since moved on from using lf. I don't think it's useful to try to push against upstream's resistance and I have no intention of maintaining a fork when alternatives like ranger exist.

dumblob commented 2 years ago

I haven't used lf for ~2 years (nor any other *nix file manager, as I did not need them until now :wink:).

But one thing is sure - situation and people's beliefs and opinions change over time. So I am again asking here what the current stance would be.