Closed Prof9 closed 2 years ago
Hm... maybe this should just be an error instead. When opening a file with the copy signature I usually intend to use an unmodified version of it as a base. Silently opening it for in-place editing may be unexpected. What do you think?
TL;DR: the directive is called .open
, not .opencopy
, therefore I think that opening the file should be the default unless a copy is required (due to input and output being different).
armips uses std::filesystem::copy_file
(or a reimplementation of it) which fails when from
and to
are the same. So the current behavior is that this warning is emitted:
test.asm(6) fatal error: Could not copy file <path>
I think there are two ways to interpret the following operation:
.open "in.bin","out.bin",0
in.bin
, modify it, then write it to out.bin
.in.bin
to out.bin
, then open out.bin
and modify it in place.The second is what is currently documented in the Readme but I would argue the first is just as, if not more valid:
.open
, not .copy
or .opencopy
; the copy operation is implicit. (Albeit documented in the Readme.).open
with 2 and 3 arguments are overloads of the same directive.inputname()
is added in the future, you would probably set both inputname()
and outputname()
to the same file if you use the 1-file invocation of .open
.As for the second interpretation, it could be argued whether copying a file to itself constitutes an error or a no-op. If you consider it a no-op, then the two interpretations do effectively the same. (Though I do agree it makes more sense to think of that as an error.)
Note that in the past, you could do this on Linux but not Windows. (See #99.) With the switch to std::filesystem::copy_file
, this is no longer the case, as that function explicitly errors in this situation.
Additionally, I frequently use armips as part of a larger build pipeline where I invoke it something like this:
armips src.asm -strequ ROM_IN "%ROM_IN%" -strequ ROM_OUT "%ROM_OUT%" -strequ TEMP "%TEMP_FLDR%"
With this, I might start with ROM_IN
and ROM_OUT
being different files (the base ROM and the patched ROM), but as the project grows larger, I might need to run multiple passes on the same ROM. Then it makes sense to have ROM_IN
and ROM_OUT
be the same file (e.g. both pointing to "%ROM_TEMP%"
), as it models a single step in the pipeline. Otherwise I would have to ensure each step reads from and writes to different files, which makes it annoying to reorder things later.
If I wanted to account for this, currently I'd have to do something like this in the .asm
file:
.if ROM_IN == ROM_OUT
.open ROM_IN,0x8000000
.else
.open ROM_IN,ROM_OUT,0x8000000
.endif
This, however, is not bulletproof as ROM_IN
and ROM_OUT
might ultimately point to the same file but have different relative paths. I guess there could be an absolute()
function in armips that converts a relative path to an absolute path, though.
I'm not sure I understand the use case fully here, is there a reason to run the same assembly script sometimes in-place and sometimes making a copy, with variable input/output filenames like that? I would imagine that either that macro or detecting/handling this at a higher level, in the scripting language being used for this pipeline, should do it.
As for the fix, I think if there is a full paradigm shift to read input file
> operate in memory
> write to output file
, then this sort of change makes sense. But silently eliding a copy operation instead of erroring out of what seems (to me) to be an unintended usage of the directive might be problematic.
This, however, is not bulletproof as
ROM_IN
andROM_OUT
might ultimately point to the same file but have different relative paths. I guess there could be anabsolute()
function in armips that converts a relative path to an absolute path, though.
That wouldn't be any less true of this pull request, right? Symlinks etc. would still get in the way.
My personal opinion: I like that armips is designed to so easily not modify in place. But maybe I'm for some reason chaining various tools, so I guess it could make sense to allow this form of modify in place. It does feel like a probable mistake, though...
Anyway, if this change is made... it seems like this ought to use std::filesystem::equivalent()
for comparing the two paths, not a straight ==
.
-[Unknown]
That wouldn't be any less true of this pull request, right? Symlinks etc. would still get in the way.
You're right, I guess std::filesystem::absolute()
does not resolve those either. I believe std::filesystem::equivalent()
does, though (this is also used internally by std::filesystem::copy_file
), so I'd have to replace the ==
with equivalent()
in that case.
I would imagine that either that macro or detecting/handling this at a higher level, in the scripting language being used for this pipeline, should do it.
It's true that there are other ways around it and checking in the pipeline script also makes sense. Mainly I think opening the file in-place in this situation is a more useful behavior than simply erroring out, and makes more sense given the name of the directive (.open
/.openfile
).
However, it sounds like people prefer that it gives an error. In that case, maybe this should be made more explicit in the Readme then? E.g.:
If two file names are specified, the assembler will first copy the file from
OldFileName
toNewFileName
, then openNewFileName
. In this case, if the copy operation fails, e.g. because the two paths point to the same file, an error is produced.
Alternatively, there could be a directive a la .relativeinclude on/off
that allows the user to opt into the behavior, e.g. .ignorecopyerror on/off
.
After discussion with @Kingcom, it was decided this will remain an error and the Readme will be made more explicit on this point.
Looks like std::filesystem::equivalent
is not supported on macOS? Not really sure what the problem is there. (See action log)
I've removed the equivalent
call for now; it just means the line number on the .open
error will be wrong, since the error will be thrown during the encoding stage instead of the validation stage.
Turns out the implementation on Windows and Linux just doesn't adhere to C++20; fixed by switching it out for the correct overload.
This fixes stuff like:
by opening the file directly instead of doing a copy when the input and output files are the same.
The check also could have been implemented in
parseDirectiveOpen
, but you will probably want to callgetFullPathName
to catch dumb edge cases like:CDirectiveFile::initCopy
already callsgetFullPathName
on both paths anyway so I placed it there.