Closed witwall closed 6 years ago
@witwall thanks for the report. To investigate I'll need a datafile that illustrates this problem. Can you attach one here or email one to me?
Also your report shows how things are wrong with tnef -t winmail.dat
but then you say it has "right name" with the same command? Please clarify.
Sorry, my typo: the names are wrong when extracting, but right with -t (which just shows the names).
and I will send you a sample.
Here's one. winmail.dat.gz Apparently the filenames are in Big5, which creates gobbledygook filenames when extracted on a UTF-8 system.
I haven't looked at the data file; but any ideas how one would know what encoding the filenames are in?
./tnef -t winmail.dat>list.txt
Open list.txt with vcode; it should be Big5, but it still has wrong characters.
Sorry, for privacy reasons I have to remove it.
And here is my GBK example,
#with correct encoding
./tnef -t winmail.dat>list.txt
Maybe we can get the right encoding/code page from this attribute:
attOEMCODEPAGE 0x9007 OEM Codepage
In this example the code page is reported as rcpg936a, which means GBK (OEM code page 936),
and in the Big5 example it is rcpg95 (should be 950?).
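To illustrate the idea above, here is a minimal Python sketch of mapping a Windows code page number to a codec and decoding a raw filename with it. It is not how tnef works today. The helper names `codec_for_codepage` and `decode_filename` are hypothetical, and the sketch assumes the code page number (936, 950, ...) has already been read from the attOEMCODEPAGE attribute.

```python
import codecs


def codec_for_codepage(codepage: int) -> str:
    """Return the Python codec name for a Windows code page number.

    Hypothetical helper: relies on Python's "cpNNN" codec names and
    raises LookupError if Python has no codec for that code page.
    """
    return codecs.lookup(f"cp{codepage}").name


def decode_filename(raw: bytes, codepage: int) -> str:
    """Decode raw attachment-name bytes using the given code page."""
    return raw.decode(codec_for_codepage(codepage))


if __name__ == "__main__":
    # Simulated raw names, as they might appear in a GBK or Big5 winmail.dat.
    gbk_name = "测试.txt".encode("gbk")
    big5_name = "測試.txt".encode("big5")
    print(decode_filename(gbk_name, 936))
    print(decode_filename(big5_name, 950))
```

Once the name is a Unicode string, it can be written out in whatever encoding the local filesystem expects (UTF-8 on most current systems).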
Also consider many users will be extracting on UTF-8 systems and it would be nice to convert the filenames by default. And only leave them raw if an option is given.
Thanks for all the info. Full disclosure: I am not going to take much, if any, time to look into this until the very end of the year when I have some time off. No guarantees that I will even release anything to fix this issue.
OK. I don't get such files often. By the way I notice wget has --local-encoding=encoding --remote-encoding=encoding
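The behaviour suggested in the last two comments (convert filenames by default, keep them raw only on request, with wget-style source and target encodings) could look roughly like the Python sketch below. This is an assumption about a possible design, not an existing tnef feature: `transcode_name`, `remote_encoding`, `local_encoding`, and `keep_raw` are hypothetical names chosen for illustration.

```python
def transcode_name(
    raw: bytes,
    remote_encoding: str,
    local_encoding: str = "utf-8",
    keep_raw: bool = False,
) -> bytes:
    """Convert a filename from the sender's encoding to the local one.

    Returns the bytes unchanged when keep_raw is set, or when the name
    cannot be decoded or re-encoded, so extraction never fails outright.
    """
    if keep_raw:
        return raw
    try:
        return raw.decode(remote_encoding).encode(local_encoding)
    except UnicodeError:
        # Covers both decode and encode failures: fall back to raw bytes.
        return raw


if __name__ == "__main__":
    big5_name = "中文.doc".encode("big5")
    # Default: converted to UTF-8 for the local system.
    print(transcode_name(big5_name, "big5"))
    # Opt-out: bytes are left exactly as stored in the file.
    print(transcode_name(big5_name, "big5", keep_raw=True))
```

Falling back to the raw bytes on error is one possible choice; a real tool might prefer to warn or to substitute replacement characters instead.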
Regretfully, I am not going to fix this issue and am closing it.
I do not feel I have the time & energy to properly handle code pages in TNEF. I am open to reviewing any patches to add such features.
I will be adding a little special debugging output so that CodePage data is easier for anyone to identify, and I will update the README & man page to make it clear that TNEF makes assumptions about the data being in some Unicode encoding.
Sorry @witwall & @jidanni . Thanks for submitting this issue and the work you've put into it.
If attachments are named in Chinese, the extracted files do not get the right names.
Here are the output file names when I run
tnef winmail.dat
and here are the right names when I run
tnef -t winmail.dat
btw, I am using Mac OS X