Closed Cirn09 closed 3 years ago
Hi. Thank you again for giving your feedback. It is much appreciated.
I was unfamiliar with the file
command or libmagic
. Since I mostly use Windows, the file type's is usually defined by the file's extension. I think this is a great idea and it should be implemented. I am expecting the library that provides the "guessing or the file's type" is only peeking into the first bytes to get the file's header or something like that. This proposed solution would also help implementing the use case you specified in #50.
Looking at your resources, it might not be strait forward to implement such a feature. nscaife/file-windows requires Bash for Windows (LXSS) under Windows 10 build 14393 (or maybe Linux Subsystem for Windows) which I do not have. And I develop on Windows 7. The sources from http://gnuwin32.sourceforge.net/packages/file.htm might be easier to compile since the source code is provided and it only have dependencies on zlib and regex. I have also take a look at file/file and I am not certain I would be able to compile this on Windows. It is makefile based. Maybe with msys2 or mingw64 but hanling licences requirements with these compilers is a nightmare. In worst case scenario, I might directly use the binaries from http://gnuwin32.sourceforge.net/packages/file.htm.
I have other priorities at the moment regarding my other repositories. I know I have already asked you but would you like to contribute to the project ? I am open to pull requests. I would also appreciate if you could find out how to compile any of the proposed resources on Windows. CMake might be a good candidate. If not, I will eventually have a look at this before implementing issue #50.
Edit: syntax.
In fact, I don't know how to compile file for windows at all. But I found this: https://github.com/julian-r/file-windows It uses appveyor and cmake, MIT License. And I have no problem compiling and testing locally.
file-windows seems to be just for compiling the file
, although libmagic
is also compiled during file compilation.
So I learned CMake, modified the CMakeLists.txt
of that project, and submitted a PR.
That's a great finding! I will take a look at this. I will likely be able to integrate this without too much hassle. Please allow me some time to implement this. I do not have much time right now and I have other priorities. I will have more free time in 3 weeks and will start working on this. If the PR is not accepted by the time I start working on this, I will create a temporary fix on my side in order to compile file-windows. I am really glad that you contribute to the project with amazing new ideas. I never though of this.
May I ask why you though about naming the property filetype
instead of something like mimetype
. To me, the file's type is more associated with File Explorer's "Type" (see below for an explanation). If I understand correctly, libmagic
library should return a mime type based on the content of the file. If the first few bytes contains <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
, the library should report as text/xml
or maybe application/xml
. So, from my experience, I think it would be more appropriate to name the property mimetype
. I could even implement File Explorer's type and name it selection.filetype
. But I could be wrong.
A brief explanation of File Explorer's file type: If you set your view in "Details" mode, the Type
column indicates the type's of the file. This type is based on the file's extension. For example, all .txt files are "Text Documents". This type name changes based on the associated program that opens the file. For example, I have installed IrFanView, which now handles most image files. If I look at the type for .bmp files, I see IrFanView BMP File
.
Indeed, this attribute is more appropriately called minetype
.
In addition, I wrote about the pr while learning cmake. It still has some problems and something wrong in exporting information. I think I should be able to fix them. This is a good practice for learning cmake.
I seem to be unable to create an issue in your Cirn09/file-windows repository. Can you update your Findpcre2.cmake in your repository to support building in Debug configuration ?
Basically, you need to modify the find_library
lines with the following:
find_library(PCRE2_LIBRARIES NAMES pcre2 pcre2-8 pcre2d pcre2-8d PATHS "$ENV{PCRE2_INSTALL_DIR}/lib")
find_library(PCRE2_POSIX_LIBRARIES NAMES pcre2-posix pcre2-posix-8 pcre2-posixd pcre2-posix-8d PATHS "$ENV{PCRE2_INSTALL_DIR}/lib")
(I added the d
postfix after the expected library names.)
I seem to be unable to create an issue in your Cirn09/file-windows repository. Can you update your Findpcre2.cmake in your repository to support building in Debug configuration ? Basically, you need to modify the
find_library
lines with the following:find_library(PCRE2_LIBRARIES NAMES pcre2 pcre2-8 pcre2d pcre2-8d PATHS "$ENV{PCRE2_INSTALL_DIR}/lib") find_library(PCRE2_POSIX_LIBRARIES NAMES pcre2-posix pcre2-posix-8 pcre2-posixd pcre2-posix-8d PATHS "$ENV{PCRE2_INSTALL_DIR}/lib")
(I added the
d
postfix after the expected library names.)
Done!
in addition: there are still bugs in #94, I will fix it later
Don't worry about changing Findpcre2.cmake
. I modified install_libmagic.bat
to build in Release mode. See 0d1f96aeb2cb879e7c6e5702914a795bc9c371f6 for details.
I will leave GetMGCPath()
as is. The function GetCurrentModulePathUtf8()
is implemented in shellext.dll. You were unable to link from shellanything.dll to shellext.dll because the dependency is the other way around; shellext.dll (the application) depends on shellanything.dll (the core library). That why shellanything.dll will never be able to link to a function in shellext.dll; it would create a circular reference. Copy and pasting is perfectly fine. There is no GetCurrentModulePathUtf8()
or GetCurrentModulePathUtf8()
in RapidAssist either.
I have locally enabled all properties provided by libmagic in a new unit test. See 716006040e699aebd16f179fe9f37faf1ff623f2.
For example, for _shellanythingunittest.exe, the following properties are registered:
selection.mimetype=application/x-dosexec
selection.description=PE32+ executable (console) x86-64, for MS Windows
selection.extension=exe/com
selection.charset=binary
For configurations/default.xml , the following properties are registered:
selection.mimetype=text/xml
selection.description=XML 1.0 document, UTF-8 Unicode text, with CRLF line terminators
selection.extension=???
selection.charset=utf-8
I don't see how selection.extension
could be useful. It there an actual use case for selection.extension
? I will leave it commented for the moment. I also renamed the property to selection_libmagic_ext
. I think it was too much confusing with selection.filename.extension
.
Issue #94 is merged (a.k.a. done). I am assuming that you are referring to your comment about "path with Chinese character". Is this bug introduced when you implemented support for libmagic ? I would prefer discussing about remaining bugs in this issue if it is related to libmagic. Otherwise, we could open another issue specific to this problem. This would allow to close one issue at a time and not having an issue "open forever" since we keep finding new bugs while we introduce new features. Also, I will have difficulties working on this problem since I do not have a Chinese code page (or a Chinese font) installed and I am unable to read Chinese.
You did not provided much details about a problem with Chinese characters. Here are some of my assumptions about the problem:
Quickly looking at the code, the problem might be located in FileMagic.cpp#L65. You are providing an utf-8 encoded path to the magic_load()
function. I seriously doubt that libmagic is expecting utf-8 encoded strings. I have not verified but I may be expected an ANSI string. In other words, a string encoded in your current code page. For example, since I live in Canada, my current code page is 850 (I think it matches or is close to Windows-1252 encoding) :
C:\Users\antoine>chcp
Active code page: 850
For example, multiple files on my computer contains the é
character (I usually speak French). This character is actually ASCII character 130. In other words, I do not need to call GetModuleHandleExW()
to get a path that contains such a character. The function GetModuleHandleExA()
works perfectly fine.
In line 65, you are converting a wide-character encoded path (from calling GetModuleHandleExW()
) to utf-8. You should try to use GetModuleHandleExA()
instead to get an ANSI encoded string. This would prevent converting to utf-8. Directly feed this string to function magic_load()
and see if it is working.
It might be worth giving it a shot.
The encoding of the strings returned from functions FileMagic::GetMIMEType()
, FileMagic::GetDescription()
, FileMagic::GetExtension()
and FileMagic::GetCharset()
is not specified. Since all calls to libmagic goes through FileMagic.h, I think FileMagic.h api should returns utf-8 encoded strings (ready to use by ShellAnything. Internal strings are utf-8 encoded).
I think these functions are the best location where the encoding conversion should occur. I do not know much about libmagic but I would expect that returned string when calling const char *result = magic_file(magic_cookie, path.c_str());
is ANSI encoded. Again, strings encoded in the current code page. They should be converted to utf-8 to be able to compared with a string coming from an xml file:
const char *result = magic_file(magic_cookie, path.c_str());
if (result == NULL)
{
ShowErrorMessage("ShellAnything(libmagic) Error", std::string("ERROR loading file: ") + magic_error(magic_cookie));
return std::string();
}
else
{
return ra::unicode::AnsiToUtf8(result);
}
Note the use of ra::unicode::AnsiToUtf8()
.
Context.h
. Indeed, this header is intended to be part of a public api (it is stored in include\shellanything
while you stored FileMagic.h
in src
. This was a problem. I moved the "include FileMagic.h" inside Context.cpp.magic.mgc
every time we create a new Context. Also, I was concerned about calling magic_open()
twice without a magic_close()
between the calls. This situation was reproducible in unit tests when we created 2 or more Context instances. The class name and source files were also renamed to FileMagicManager.h and FileMagicManager.cpp to reflect this change.I have created #95. I would like to get your feedback about it. I know you proposed this new feature in order to solve a use case of yours. Having a higher limit of selected files might break this use case.
I think we (mostly you) properly implemented this feature. I also created unit test to detect future breaking changes. The selection based properties section in the user manual was updated. I have also created two new sections in the user manual for helping people using these new properties: MIME types, general description and charset and How to get my files MIME types, general description and charset ?. Do you consider this issue as solved ?
Feature mostly implemented in the following commits: f6c552a6145cdf72513a5ff9cf2a57716c10af32, 716006040e699aebd16f179fe9f37faf1ff623f2, 380c60f93dd2640afcf0cefb7652503c9b67e2ef, b1603356193f67292e16f7ea716e723e90b90d6a, b893933b11c13b3c59d4114c5b81619cfd8be061.
Closing.
Is your feature request related to a problem? Please describe. Like 7z's context menu
Describe the solution you'd like
selection.filetype
-> mime type of selected file for example: "zip file" ->application/zip
some useful resource: http://gnuwin32.sourceforge.net/packages/file.htm https://github.com/nscaife/file-windows https://github.com/file/fileDescribe alternatives you've considered N/A
Additional context N/A