Closed TNTBOMBOM closed 3 months ago
Indeed, R4.2 introduces stricter filename validation: a name needs to be (among other things) a proper UTF-8 string. Maybe you use some different encoding for your file names?
Maybe you use some different encoding for your file names?
I was using Debian 11 as a standalone backup, which I had shifted from Qubes 4.1 using Qubes backup. After restoring my backup, I now want to create a new standalone based on Debian 12. I successfully created the new Debian 12 standalone and attempted to transfer my files to it. However, I encountered an issue where only files with non-English characters became stuck during the transfer process.
Workaround: you can change the file names to English before transferring them. Once the files have reached the destination, you can revert the changes by renaming them back to their original names.
Workaround: you can change the file names to English before transferring them. Once the files have reached the destination, you can revert the changes by renaming them back to their original names.
I recommend archiving them all into a single archive file so that you can keep the original names in the destination.
B
Note that this loses the protection provided by qfile-unpacker’s filtering.
The error message (“invalid or incomplete multibyte or wide character”) is horrible, not least because it provides no information as to which character was disallowed. At a minimum, there should be separate errors for “name is not valid UTF-8” and “name contains a forbidden UTF-8 character”, and the latter must state explicitly which character was forbidden.
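To illustrate the two separate errors asked for above, here is a minimal sketch, not the actual qfile-unpacker code. The exception names and the `is_forbidden` rule (control characters and combining marks) are assumptions made up for this example; the real filter uses its own, more detailed list.

```python
import unicodedata


class NotUtf8Error(ValueError):
    """The name is not a valid UTF-8 byte sequence."""


class ForbiddenCharError(ValueError):
    """The name decodes fine but contains a character the filter rejects."""


def is_forbidden(ch: str) -> bool:
    # Hypothetical rule set for illustration only (NOT the real list):
    # control characters (category Cc) and combining marks.
    return unicodedata.category(ch) == "Cc" or unicodedata.combining(ch) != 0


def check_name(raw: bytes) -> str:
    """Return the decoded name, or raise an error that says what was wrong."""
    try:
        name = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Report the first offending byte and where it sits in the name.
        raise NotUtf8Error(
            f"name is not valid UTF-8: bad byte 0x{raw[exc.start]:02x} "
            f"at offset {exc.start}"
        ) from exc
    for index, ch in enumerate(name):
        if is_forbidden(ch):
            # Name the exact code point so the user knows what to rename.
            raise ForbiddenCharError(
                f"name contains forbidden character U+{ord(ch):04X} "
                f"({unicodedata.name(ch, 'unnamed')}) at index {index}"
            )
    return name
```

With messages like these, a user can tell an encoding problem apart from a filter decision, and sees which character to change.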
I recommend archiving them all into a single archive file so that you can keep the original names in the destination.
Yes, and it destroys the whole concept of security improvements that qvm-copy now should provide. The whole strict policy change makes no sense in this case, only affects user experience.
I personally think that no longer being able to copy symlinks and files with non-UTF-8 chars in their names is a major design mistake. The user data should be preserved at all costs. Ext2/3/4 allows non-UTF-8 chars in paths and that is how it is. Only '\0' is not allowed; it is the only exception according to the FS specs.
@marmarek @DemiMarie
What about adding some flags to qvm-copy to make it work as it was prior to R4.2, as expected by all current users?
Is it possible to add it before the R4.2 release, which will break users' experience and force users to pack everything in tar (like @brendanhoar suggested and I will also have to do) just due to the fear that qvm-copy will silently fail with the copy process, skip something or break intentionally?
Moving my messages here:
Let's consider a directory with 100 000 files. It has 1 absolute symlink somewhere in the depth, 1 relative symlink targeting above this directory, and several files that are fine except their filenames have byte sequences that cannot be interpreted as utf-8 string.
User wants to copy it to the other qube (maybe much less secure) the same way as EVERY other copy tool works: like cp and including qvm-copy that existed for many years up to R4.1.
What should the user do to achieve it with the new R4.2 approach?
There are 2 cases: from terminal and from GUI (Nautilus or Dolphin). What the user should do in both cases to copy files and avoid user's data loss?
Would it make sense to have a "qvm-copy --insecure" tool/option (including in the file manager)?
I think it definitely would!
I understand what you are saying, I just think the consequences can be worse than what's gained, at least in my use cases. Because qvm-copy is already "secure", and the filtering is like adding an additional layer. Like: why not check copied files with antivirus, or block copying shell/ELF files or something; it all looks like the same approach to what a copy tool should do to be called secure.
For me secure is more like:
safe.
R4.1 meets these criteria completely.
My main point: After these changes the user will never be sure if Qubes OS copy tool will work as they expect:
Best to model the Linux coreutils cp command?
@marmarta what do you think? I don’t know enough about UX to be able to make an informed statement here.
@jamke So the problem with symlinks is:
echo > ~/QubesIncoming/x/a
# oops, overwrote ~/.bashrc (or some other important file)
cat ~/QubesIncoming/x/b
# oops, just read e.g. ~/.ssh/id_ed25519 (or some other secret key)
As far as I can tell, symlinks are off-topic in this issue.
@DemiMarie
@jamke So the problem with symlinks is
It is not a problem, it is what symlinks are. It is the same in any GNU/Linux distro.
Why not also fail with an error on copying shell scripts? ELF binary files? Or exe files (they run on double-click in wine/mono even without x permission!)? Why not add Antivirus checks on copying?
I have an answer: because all this is not required from secure and reliable copy tool, even if it adds some additional security, the same goes for current breaking and unasked changes.
And what is the attack scenario that can be exploited this way (forcing user to run such commands)? User blindly running some commands from the internet? And who can copy these malicious symlinks from one qube to another in the first place? Only user can do it, selecting target qube manually, not some malicious application.
I can give up on convincing you that all these changes (breaking symlinks, failing on valid ext4 filenames) are a mistake that brings more problems to users than security (it also makes the qvm-copy tool and scripts that depend on it less reliable).
Just, please, make it possible to run qrexec-client-vm with some option which will make it copy files as expected: binary perfect without unasked checks and errors. As cp, qvm-copy in R4.1 and any other copy tool does.
@jamke This would need to be a separate service (qubes.FileCopyUnsafe?) so that the prompt presented to the user indicates that this is a more dangerous action. @marmarta thoughts on the name?
This would need to be a separate service (qubes.FileCopyUnsafe?) so that the prompt presented to the user indicates that this is a more dangerous action.
@DemiMarie Sounds interesting, I like the idea.
I think this is something that could be considered. This would still preserve less metadata than tar or cp -a, because the latter two preserve e.g. file ownership, SELinux contexts, and possibly hard links as well.
I think this is something that could be considered. This would still preserve less metadata than tar or cp -a, because the latter two preserve e.g. file ownership, SELinux contexts, and possibly hard links as well.
I agree.
The whole process can be considered similar to copying between 2 air-gapped PCs with a flash-drive formatted to ext4. At least it sounds logical and fits architecture of qubes as separated PCs.
Absolutely agree that there should be a "copy_unsafe" option; ran into this issue now several times and it is extremely annoying to deal with...even when a file name has only Latin characters it sometimes crops up, not even to mention non-Latin ones.
I'll also add that UX is still pretty bad with the error message, as the target qube is asked for with a pop up dialogue, but the error / failure message is just an echo in the terminal; the last time I did qvm-copy and it failed I actually lost the data, because I assumed that the error would be another pop up dialogue, which didn't happen, and the filename was only Latin characters (and some apparently non-standard apostrophes) and then I shut down the disposable thinking everything had been transferred when it wasn't (it was easy to recreate that data - this time -, but still...super bad UX).
I assumed that the error would be another pop up dialogue, which didn't happen
That should be fixed by https://github.com/QubesOS/updates-status/issues/4240 + https://github.com/QubesOS/updates-status/issues/4233
Edit: Those are for GUI file copy
(Since the change in behavior from 4.1 to 4.2 was intentional, instead of closing this as "not a bug," I've converted this issue into an enhancement request to allow the desired behavior in some way.)
Absolutely agree that there should be a "copy_unsafe" option; ran into this issue now several times and it is extremely annoying to deal with...even when a file name has only Latin characters it sometimes crops up, not even to mention non-Latin ones.
Is there a list of known Unicode grapheme clusters? Right now, combining characters are rejected to block Zalgo text.
Is there a list of known Unicode grapheme clusters? Right now, combining characters are rejected to block Zalgo text.
No idea, but the file name that was refused to be copied simply contained: ‘ and ’ | as the problematic portions (when I replaced these with just spaces qvm-copy worked), so nothing that looked weird or made me suspect that qvm-copy would fail...so technically the problem was not Latin characters (the rest of the file name was Latin characters, these were just making the long file name more readable), but just an everyday file name without weird stuff that refuses to be copied.
I just checked and ‘ (U+2018) and ’ (U+2019) are (correctly) accepted. If it would not reveal anything sensitive, would you mind filing a bug report with the specific file name? To avoid ambiguity, it might be better to use C escape syntax for the non-ASCII bytes.
the last time I did qvm-copy and it failed I actually lost the data
Sorry to hear that.
So, let's note that there are case(s) of user data loss due to these changes, as I would expect, and no real-life cases where a user was "saved" by this unnecessary filtering and these copy failures. This is a case where a possible security improvement makes the tool less reliable.
Moreover, during the Qubes OS R1.0-R4.1 era the copy tools worked properly with no limitations on filenames and symlinks (which is now completely broken). They copied files and file trees as cp does, and during all that time I have never seen any reports from users that somebody was infected or affected, or that somebody was asking for copying to be burdened with additional filtering and made unreliable.
The huge problem, in my opinion, is that currently, when a user starts qvm-copy on a 1 TiB directory, they can NEVER be sure that it:
cp would)
In all cases the user will be required to take many additional steps manually and start the copy process again. Almost for nothing.
@DemiMarie I did some more testing and it turns out it was the "vertical bar" character that caused the issue; opened #8807 now.
@UndeadDevel QubesOS/qubes-linux-utils#109 should add that to the safe list.
As I understand it the motivation for this (according to this PR) is the following:
There is currently no restriction on allowed file names. This means that Trojan Source-style attacks might be possible.
However, the docs already hint at this possibility:
However, one should keep in mind that performing a data transfer from less trusted to more trusted qubes is always potentially insecure if the data will be parsed in the target qube. This is because the data that we copy could try to exploit some hypothetical bug in software running in the target qube.
So unless I am misunderstanding something, I don't think qvm-copy should prevent the user from shooting themselves in the foot.
To elaborate more on this, I never understood qvm-copy as a fundamental sanitizing operation. It should match user's expectations from the cp command generally. (I get the symlink exclusion part, so let's ignore that)
To me personally, qvm-copy should not give any chance to filenames to interfere with the qfile-agent in the receiving end. If another program (or user) can be exploited or tricked, that's beyond the responsibility of qvm-copy and thus it should not make judgements (blocking copies).
As some have highlighted, the user impact is severe. I've had this on an innocent looking music file, but if we're talking about a journalist, who receives potentially untrusted documents from sources and wants to copy to another qube, this is a major regression. The workflow would be severely hampered since, if there's any non-perfect UTF character in a name, the whole source archive won't be movable from the source qube at all, except with the use of trickery (like zipping).
Given the high impact and (at least to me) relatively little explanation of what practical threats this can mitigate against, my recommendation would be to reconsider this or provide a workaround (preferably GUI-based), or perhaps make the "unsafe" (read: 4.1-like) the default and perhaps a non-default qubes.FileCopySanitizeFilenames policy.
To elaborate more on this, I never understood qvm-copy as a fundamental sanitizing operation. It should match user's expectations from the cp command generally.
I completely agree with @deeplow! I would press like button multiple times if it were possible.
The workflow would be severely hampered since, if there's any non-perfect UTF character in a name, the whole source archive won't be movable from the source qube at all, except with the use of trickery (like zipping).
True, the current breaking change for both ext4-valid filenames and valid symlinks is a mistake and a regression.
BTW, ext4 is char-set agnostic and allows any bytes in filenames except '\0' by design.
This sanitizing idea is only making things worse. No infection of Qubes OS users has ever happened this way, but the copy (like in cp) logic is already broken in R4.2 for everybody. And users have to pack and unpack files. I have not seen anybody actually rooting for this breaking change; it is not as if some Qubes OS users were hoping for this change at this point.
Given the high impact and (at least to me) relatively little explanation of what practical threats this can mitigate against, my recommendation would be to reconsider this or provide a workaround (preferably GUI-based), or perhaps make the "unsafe" (read: 4.1-like) the default and perhaps a non-default qubes.FileCopySanitizeFilenames policy.
I agree with this.
@marmarek can this really be reconsidered? Is it possible?
The first step: revert the default logic, so fewer users would be affected by it and notice problems. Users used the copy tool for 10+ years in Qubes OS with NO complaints.
Second step: consider a way to make a user choice for optional copying with sanitizing if user really needs it.
I agree completely with both of those points. Mostly with the first, but if we really need the second, it can be like a qvm-copy with an --extra-paranoid option or something...
After some discussion on Matrix (thanks @xaki23 and @DemiMarie for the ideas!), here is a proposal:
Keep filename filtering in qubes.Filecopy, but introduce a new service qubes.UnsafeFilecopy without filename filtering (but with symlink checks, at least for now). The default policy for the "unsafe" one will (by default) be the same as for the filtered one - "ask", but can be changed.
Additionally, qvm-copy will gain a new --sanitize option that will replace forbidden characters with _ or a similar allowed character.
Then, qvm-copy can check for forbidden chars; if none are found, it will use the qubes.Filecopy service. But if there are some, we have two possibilities:
- use qubes.UnsafeFilecopy - depending on the policy, it will either result in a similar prompt (in default setup), but one can configure it also to deny without any prompt. If it doesn't exist in the target (for example updates not installed), it will fail, but so will the current qubes.Filecopy
- if the --sanitize option is given: use qubes.Filecopy (filtered one), but replace forbidden chars with _ or such, so the copy will still succeed, just with sanitized file names
The above proposal, with all updates installed (in the source qube, target qube and dom0) should allow unfiltered filenames by default, but should still allow somebody to enforce filtering if they want to. There are a few corner cases if not all is updated (like only source qube but not the target one), but those will affect copying with "unsafe" filenames only, which are currently rejected anyway (so, technically it isn't worse than the current situation).
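The decision flow in this proposal can be sketched roughly as follows. This is an illustration under stated assumptions, not the qvm-copy implementation: `plan_copy`, `sanitize` and the `is_forbidden` rule (control characters only) are hypothetical names and rules invented here, names are treated as already-decoded strings, and qubes.UnsafeFilecopy is the service name proposed above.

```python
import unicodedata

FILTERED_SERVICE = "qubes.Filecopy"
UNFILTERED_SERVICE = "qubes.UnsafeFilecopy"  # name proposed in this thread


def is_forbidden(ch: str) -> bool:
    # Placeholder rule for illustration: control characters only.
    # The real safe list is defined by the Qubes tools, not by this sketch.
    return unicodedata.category(ch) == "Cc"


def sanitize(name: str, replacement: str = "_") -> str:
    """Replace every forbidden character with an allowed one."""
    return "".join(replacement if is_forbidden(ch) else ch for ch in name)


def plan_copy(names, sanitize_requested=False):
    """Return (service_to_call, names_to_send) for a list of file names."""
    has_forbidden = any(is_forbidden(ch) for name in names for ch in name)
    if not has_forbidden:
        # Nothing to filter: the ordinary, filtered service is enough.
        return FILTERED_SERVICE, list(names)
    if sanitize_requested:
        # --sanitize: stay on the filtered service, but rename on the fly.
        return FILTERED_SERVICE, [sanitize(name) for name in names]
    # Default: ask for the unfiltered service; policy decides what happens.
    return UNFILTERED_SERVICE, list(names)
```

Because the source qube only picks which service to request, the final say stays with the policy and the confirmation prompt, which is the point of using two service names.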
I agree completely with both of those points. Mostly with the first, but if we really need the second, it can be like a qvm-copy with an --extra-paranoid option or something...
Thank you for support. I also think the first step is crucial, as the whole change was not well thought through in advance.
--sanitize option, that will replace forbidden characters with _ or similar allowed character.
This is user data loss. For example, math chars were only recently added to the allowed list, so they would have been lost in this case. A filename is entirely the user's data, is sometimes quite important, and changing it can break things in millions of cases.
I am not against this --sanitize flag, but I want to make it clear that it should be used with caution and for sure not as a default approach by anyone.
There are a few corner cases if not all is updated (like only source qube but not the target one), but those will affect copying with "unsafe" filenames only, which are currently rejected anyway (so, technically it isn't worse than the current situation).
Do you mean it won't be able to copy properly if template or standalone qubes is recovered and not updated? "Current situation" is "will not able to copy" as in R4.2, or you mean "unsafe successful copy" as in R4.1?
Do you mean it won't be able to copy properly if template or standalone qubes is recovered and not updated? "Current situation" is "will not able to copy" as in R4.2, or you mean "unsafe successful copy" as in R4.1?
I mean it won't magically change if you don't install update containing the change. If you install updates (especially on the receiving side), it will behave similar to R4.1 by default.
Maybe I did not understand your proposal completely, but would appreciate if you consider my vision (maybe it does not contradict yours that much):
1. Firstly, revert the default behavior to what it was for 10 years until R4.1 for a completely updated (dom0 and templates) system. The change should be made gradually (read further points).
2. Consider adding a cool feature to the source qube itself: the sending qube will run checks and show the information about the non valid utf-8 chars being used or having symlink in the selected files. User sees the problems, with the information about filenames, it is already a huge help, maybe user will fix filename manually anyway. And then user decides what to do, right inside the source qube, "cancel operation", sanitize or "proceed copying as is". In case of GUI version, the decision should be made with GUI buttons. Great UX.
3. Copy as is should be, at least at first, the default policy for R4.2, but the user should already be able to modify policies to limit the source qube or target qube, making it stricter by forcing sanitized copying only. The policy implementation is out of question. It will solve the problem for more paranoid/concerned users already in R4.2.
4. Only later, maybe in R4.3, when users ARE already USED TO the message about invalid chars or symlinks, and are familiar with the whole sanitising thing, the policy may be changed to a stricter one, so that unsafe copy would require manually changing configs/policy in dom0 or qube settings.
5. There must be a flag like --copy-as-is that will copy any files, including symlinks, as cp does and as the user at least currently actually wants and expects. A must-have thing, in my opinion.
6. If a qube is outdated, without any changes, made in R4.1 or early R4.2, it should allow as-is copy. At least for now. Is it possible? Then, maybe for R4.3, it will be changed, but the positive side would be: most qubes will already be updated by that point and the code in them will already be aware of the new approaches.
I also think the first step is crucial, as the whole change was not well thought through in advance.
shit happens. i can strongly recommend joining the sausage factory (matrix and/or irc chats) for stuff like this. what marmarek summarized as "discussion on matrix" was an hour of "vigorous/frank exchange of views" by people with very different perspectives on the problem. to the point where someone not actively participating in the convo commented "i'm glad i joined this room. seeing discussions in realtime about qubes development is fascinating". very great example of https://en.wikipedia.org/wiki/It_takes_a_village
--sanitize option, that will replace forbidden characters with _ or similar allowed character.
This is user data loss. For example, math chars were only recently added to the allowed list, so they would have been lost in this case. A filename is entirely the user's data, is sometimes quite important, and changing it can break things in millions of cases.
I am not against this --sanitize flag, but I want to make it clear that it should be used with caution and for sure not as a default approach by anyone.
the --sanitize was my suggestion. yes, it shouldnt be the default (thats why it is a very optional option), but also something that is trivial to implement (once you have source side check the filename) and has legit usecases of the "i am too lazy to rename this, i need the file in a different qube, and also dont really care if it is the exact filename as long as it is mostly recognizable on the dest side" kind.
2. Consider adding a cool feature to the source qube itself: the sending qube will run checks and show the information about the non valid utf-8 chars being used or having symlink in the selected files. User sees the problems, with the information about filenames, it is already a huge help, maybe user will fix filename manually anyway. And then user decides what to do, right inside the source qube, "cancel operation", sanitize or "proceed copying as is". In case of GUI version, the decision should be made with GUI buttons. Great UX.
@marmarek
About this point, another thing: I was considering making another contribution by developing a proper copy progress dialog for the source qube. One like users have in Windows, KDE, etc. With a progress bar (we have it now), file size, file and folder counts, time estimation, maybe even a graph, too. Additional information should be available by clicking "show more" or something.
It can be made in python+glade, as I see the Qubes OS project used it recently for the new LVM-password-change tool. I know C/C++/Python/Bash and other languages, so other options like C++/Qt can also be considered if they are preferable.
I stopped thinking about it when I understood that current copy approach is quite limited on the backend: the copy agent just outputs the progress as strings to stdout and is completely uncontrollable and "unmonitorable" beyond that.
Like I would love to provide user with:
KDE), so user would see the progress is 100% finished with success,
KDE, mc and Far Manager show.
So, if some changes are going to be made to cross-VM copying, can something be done to achieve this level of comfort for the user?
About this point, another thing: I was considering making another contribution by developing a proper copy progress dialog for the source qube. One like users have in Windows, KDE, etc. With a progress bar (we have it now), file size, file and folder counts, time estimation, maybe even a graph, too. Additional information should be available by clicking "show more" or something.
Sounds cool indeed, but it's getting offtopic here, please create another issue for this (you can use the "Reference in new issue" option in the comment menu). As for your questions, python+glade is okay. And indeed the qfile-agent would need to be extended to report more than just percentage, but also current file (and its percentage?).
@marmarek: Does this require scanning the filesystem for forbidden characters twice, or is it instead possible to terminate the transfer and initiate a new one?
It needs to scan twice. The qvm-copy tool scans all the files already (to calculate total size), so this scan can be replaced with something that additionally will check filenames.
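As a rough illustration of folding the name check into the existing size scan, here is a small sketch. It is not the qvm-copy code; `scan_tree` and the `name_ok` callback are hypothetical names, and the sketch works on decoded string paths for brevity.

```python
import os


def scan_tree(root, name_ok):
    """Walk root once; return (total_bytes, paths whose names were rejected).

    name_ok is a caller-supplied predicate (hypothetical here) that says
    whether a single file or directory name passes the filter.
    """
    total = 0
    rejected = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            if not name_ok(name):
                rejected.append(os.path.join(dirpath, name))
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not name_ok(name):
                rejected.append(path)
            # lstat so that a symlink is measured itself, not its target.
            total += os.lstat(path).st_size
    return total, rejected
```

The list of rejected paths is exactly what a later prompt (or a "print the offending files" option) would need, so no second pass over the tree is required for that.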
After the scan, the qvm-copy tool can (in future) ask user:
The files you are trying to copy have characters that are not in the safe list.
Are you sure you want to copy them?
<Do not copy anything> <Copy these files anyway>
And perhaps something like press p to print list of files with unsafe file names...
After the scan, the qvm-copy tool can (in future) ask user: The files you are trying to copy have characters that are not in the safe list. Are you sure you want to copy them? <Do not copy anything> <Copy these files anyway>
Unfortunately, this prompt needs to be done on the destination side, because otherwise a malicious source VM can always respond with <Copy these files anyway>.
After the scan, the qvm-copy tool can (in future) ask user: The files you are trying to copy have characters that are not in the safe list. Are you sure you want to copy them? <Do not copy anything> <Copy these files anyway>
That's more or less what the current design (described in this ticket) does: source qube scans files before calling any service, and based on the scan result chooses which service to call (the one that allows only "safe" names, or the one that allows any name). Then, the user needs to confirm in the qrexec prompt, so the decision is not fully under the source qube control. While the subtle difference in the service name is easy to miss, this approach allows further extensions:
@marmarek My current plan, based on discussions on Matrix, is to unconditionally disallow ASCII control characters in filenames. They are just too dangerous for anyone who uses the CLI (as opposed to only GUI tools).
unconditionally disallow ASCII control characters in filenames. They are just too dangerous for anyone who uses the CLI (as opposed to only GUI tools).
How to copy these files using the terminal then?
I think the terminal tool can avoid showing these filenames in the terminal, if it is such a high concern (e.g. you may reject @UndeadDevel suggestion, though I consider it to be useful and fine). But being able to copy files as is (if policies allow that) should be considered a must-have, imho, for both GUI and especially terminal.
Now when considering interaction with already existing user policies and policy editor, I wonder if it wouldn't be better to add an argument to the existing qubes.Filecopy service, instead of creating a new one. For example qubes.Filecopy+allow-unsafe-names. The benefit of this approach is that existing policies (which accept any argument) would work unchanged already. And there would be still one rule per source+destination pair which would either allow any argument (for old R4.1 behavior) or allow no argument (for filtered behavior).
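To make the "any argument" versus "no argument" distinction concrete, here is a deliberately simplified model of how a policy rule's argument field could match a call. It is an assumption-laden sketch, not the qrexec policy engine: `rule_matches` is a hypothetical helper, it ignores source, destination and action entirely, and it assumes the convention that `*` matches any argument while a lone `+` matches only the empty argument.

```python
def rule_matches(rule_service: str, rule_argument: str, call: str) -> bool:
    """Check whether a (service, argument) rule covers a call string.

    call is e.g. "qubes.Filecopy" or "qubes.Filecopy+allow-unsafe-names".
    rule_argument is "*" (any argument), "+" (empty argument only),
    or "+something" (that exact argument). Simplified model only.
    """
    service, _, argument = call.partition("+")
    if rule_service not in ("*", service):
        return False
    if rule_argument == "*":
        return True
    return rule_argument == "+" + argument


# An existing "any argument" rule already covers the new variant:
print(rule_matches("qubes.Filecopy", "*", "qubes.Filecopy+allow-unsafe-names"))
# A "no argument" rule keeps only the filtered behavior:
print(rule_matches("qubes.Filecopy", "+", "qubes.Filecopy+allow-unsafe-names"))
```

Under this model, a user who wants to keep filtering enforced would narrow their rule from "any argument" to "no argument", which is the policy change discussed in the replies that follow.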
@xaki23 do you remember if we considered this approach before? Was there some issue with that?
The benefit of this approach is that existing policies (which accept any argument) would work unchanged already.
Example: I, personally, have no use for allowing unsafe names, so I'd prefer to leave them disallowed on my own system. My understanding is that my current system (4.2) already disallows them by default. If you were to implement this change as described, it sounds like I would have to somehow modify my RPC policy rules in order to disallow them again. I'm guessing most users in my situation wouldn't know to do that.
Automatically making existing policies (including users' own custom rules) more permissive sounds like an anti-secure-by-default approach.
@andrewdavidwong I thought, based on the name, the point of qubes.Filecopy+allow-unsafe-names is to make allowing unsafe filenames optional, not the default, am I right @marmarek?
But maybe I am not getting it right.
No, the idea is to make the "unsafe" variant the default. I use "unsafe" in quotes because the practical impact is rather small (exploiting this requires somebody finding another bug in a font rendering engine - those happen, but are very rare). On the other hand, as proven in several reports, hitting issues with copying files with for example Arabic names, users fallback to much less safe options (like packing into zip).
And the idea of using service name or service argument is that, besides being controlled by policy, each copy confirmation prompt will include info which operation is being used (so, even when "unsafe" is enabled, there is still explicit info if some "unsafe" files are actually being copied).
On Sat, Jun 01, 2024 at 02:08:02AM -0700, Marek Marczykowski-Górecki wrote:
And the idea of using service name or service argument is that, besides being controlled by policy, each copy confirmation prompt will include info which operation is being used (so, even when "unsafe" is enabled, there is still explicit info if some "unsafe" files are actually being copied).
I understand your argument that this is a minor reduction in risk, but I agree with Andrew that such changes should not go unannounced. We don't at the moment have a prominent way of flagging up such changes to users on their systems.
I assume that this will all be covered by the Global prefs GUI, and it should be easy to include a check box to remove this option as default for those users who do not want to have it.
Qubes OS release
4.2
Brief summary
Moving/copying files was working OK in 4.1 regardless of their names; now it is rejected because of the name of the file used.
Steps to reproduce
Try to move/copy a file named with non-Latin letters
Expected behavior
To move/copy file regardless what characters used to name it.
Actual behavior
Tasks
More detailed design in https://github.com/QubesOS/qubes-issues/issues/8332#issuecomment-1912403043
- qfile-unpacker to have extra options for disabling various restrictions (independently): file names validation, symlink target restriction (refusing absolute symlinks or pointing outside of target directory)
- qubes.UnsafeFilecopy qrexec service with file names validation disabled (but symlinks validation enabled)
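For readers unfamiliar with the symlink target restriction mentioned in the tasks, the following sketch shows one way such a check could work. It is an assumption, not the qfile-unpacker logic: `symlink_allowed` is a hypothetical helper that works purely lexically on paths relative to the top of the transferred tree and does not account for other symlinks along the way.

```python
import posixpath


def symlink_allowed(link_relpath: str, target: str) -> bool:
    """Decide whether a symlink may be kept.

    link_relpath: location of the symlink, relative to the top of the
                  transferred tree (e.g. "x/a").
    target:       the text stored in the symlink.
    """
    if posixpath.isabs(target):
        # Absolute targets always point outside the transferred tree.
        return False
    # Resolve the target relative to the directory holding the link.
    resolved = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_relpath), target)
    )
    # Anything that still climbs above the top of the tree is refused.
    return resolved != ".." and not resolved.startswith("../")
```

A link such as x/a pointing to ../../.bashrc is refused by this check, while x/a pointing to a sibling file b inside the same tree is allowed.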