Closed drunkenQCat closed 1 year ago
Hi and thanks for your feedback.
Could you please explain which field you're saving to? WAV has many chunks that follow different specifications.
public void WriteMetaData()
{
    foreach (var item in LogList)
    {
        foreach (var bwf in item.bwfList)
        {
            Track tr = new(bwf.FullName);
            // iXML fields are written through the additional fields
            WriteAdditional(tr, "ixml.SCENE", item.scn + "-" + item.sht);
            WriteAdditional(tr, "ixml.TAKE", item.tk.ToString());
            WriteAdditional(tr, "ixml.NOTE", item.scnNote + "," + item.shtNote);
            WriteAdditional(tr, "ixml.CIRCLED", (item.okTk == TkStatus.ok) ? "TRUE" : "FALSE");
            WriteAdditional(tr, "ixml.TAKE_TYPE", (item.okTk == TkStatus.bad) ? "NO_GOOD" : "DEFAULT");
            WriteAdditional(tr, "ixml.WILD_TRACK", (item.tkNote.Contains("wild")) ? "TRUE" : "FALSE");
            tr.Description = item.tkNote;
            tr.Title = item.shtNote;
            tr.Save();
        }
    }
}

// Adds the field if it is missing, overwrites it otherwise
void WriteAdditional(Track tr, string tag, string content)
{
    if (tr.AdditionalFields.ContainsKey(tag)) tr.AdditionalFields[tag] = content;
    else tr.AdditionalFields.Add(tag, content);
}
The garbled text appears in ixml.NOTE, and the question marks appear in the description and title.
I tried modifying the source so that it can write the UTF-8 information I need.
That fixed it. The picture I showed in Details reflects a problem in Wave Agent; the UTF-8 information displays correctly in metadata management software. Here is an example in REAPER:
The title is still garbled in File Explorer because my system's default encoding is GB2312.
That's the problem. I read CharsetDetector/UTF-unknown#143 and learned that it may be the cause. So is the issue that Settings.DefaultTextEncoding does not cover the other fields?
I tried to decipher the garbled text and found that it had been encoded with ISO-8859-1 but decoded as UTF-8.
The places where you found garbled text are read and written using ISO-8859-1, which does not support East Asian characters.
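To make the mismatch concrete, here is a minimal sketch of how reading UTF-8 bytes as ISO-8859-1 produces garbled text, and how the garbling can be undone as long as no byte has been lost. The class and method names are hypothetical and are not part of the library.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // ISO-8859-1 maps every byte 0x00-0xFF to the code point of the same value,
    // so decoding arbitrary bytes with it never fails and never loses data.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Simulates a reader that assumes ISO-8859-1 for bytes that are really UTF-8.
    public static string Garble(string text) =>
        Latin1.GetString(Encoding.UTF8.GetBytes(text));

    // Reverses the misreading: recover the raw bytes, then decode them as UTF-8.
    public static string Recover(string garbled) =>
        Encoding.UTF8.GetString(Latin1.GetBytes(garbled));

    public static void Main()
    {
        string original = "中文备注";
        string garbled = Garble(original);
        Console.WriteLine(garbled.Length);               // 12: one char per UTF-8 byte
        Console.WriteLine(Recover(garbled) == original); // True
    }
}
```

The recovery only works in this direction. Once text has been encoded to ISO-8859-1 and the unsupported characters have been replaced, the original characters are gone.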
I've done that because of what the specifications say:
BEXT (used for the description field): the specification says string fields should be written using ASCII. However, since ASCII is a subset of UTF-8, we can switch to UTF-8 without any issue.
LIST INFO (used for the title field): the specification says string fields should be written using ASCII. However, since ASCII is a subset of UTF-8, we can switch to UTF-8 without any issue.
iXML structure, which is already UTF-8-encoded.
The title is still garbled in File Explorer because my system's default encoding is GB2312.
Precisely. Western versions of Windows use ISO-8859-1 as their default encoding. They assume WAV metadata is encoded using ISO-8859-1, which works because WAV metadata is usually encoded using ASCII, which is a subset of ISO-8859-1.
Your version of Windows might be expecting GB2312, which is not compatible with UTF-8, hence the garbled characters displayed in Explorer.
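The incompatibility can be seen by encoding the same character both ways. This is a small sketch with hypothetical class and method names. It assumes .NET Core 3.0 or later, where legacy code pages such as GB2312 must be enabled by registering CodePagesEncodingProvider; on .NET Framework that provider comes from the System.Text.Encoding.CodePages package.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    public static Encoding GetGb2312()
    {
        // Legacy code pages are not available by default on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("GB2312");
    }

    public static void Main()
    {
        Encoding gb2312 = GetGb2312();
        byte[] asUtf8 = Encoding.UTF8.GetBytes("中");
        byte[] asGb = gb2312.GetBytes("中");
        Console.WriteLine(BitConverter.ToString(asUtf8)); // E4-B8-AD
        Console.WriteLine(BitConverter.ToString(asGb));   // D6-D0

        // A viewer that assumes GB2312 decodes the UTF-8 bytes into unrelated characters.
        Console.WriteLine(gb2312.GetString(asUtf8) == "中"); // False
    }
}
```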
=> Another way of fixing that issue and making Windows happy would be to use Settings.DefaultTextEncoding instead of UTF-8 in the library code, and to set Settings.DefaultTextEncoding to System.Text.Encoding.GetEncoding("GB2312") in your application code.
That would fix the issue with your Windows, but would completely deviate from the BEXT and LIST INFO specifications, which would make the text you save unreadable on a western computer. That's why I'd rather hardcode UTF-8 as suggested above.
Do you agree with me on that one?
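As an illustration of what hardcoding UTF-8 for a fixed-size field could look like, here is a sketch that encodes text into a zero-padded field of a given byte length. EncodeFixedUtf8 is a hypothetical helper written for this example; it is not the library's code, and the truncation policy (drop whole characters that do not fit) is an assumption.

```csharp
using System;
using System.Text;

public static class FixedFieldDemo
{
    // Hypothetical helper: encodes text as UTF-8 into a zero-padded field of
    // exactly fieldSize bytes, dropping whole characters that do not fit.
    public static byte[] EncodeFixedUtf8(string text, int fieldSize)
    {
        byte[] field = new byte[fieldSize]; // already zero-filled
        int offset = 0;
        foreach (Rune rune in text.EnumerateRunes())
        {
            // Stop before a character that would be cut in the middle.
            if (offset + rune.Utf8SequenceLength > fieldSize) break;
            offset += rune.EncodeToUtf8(field.AsSpan(offset));
        }
        return field;
    }

    public static void Main()
    {
        byte[] field = EncodeFixedUtf8("中文", 4);
        Console.WriteLine(BitConverter.ToString(field)); // E4-B8-AD-00
    }
}
```

Truncating on a character boundary matters with UTF-8 because a multi-byte character cut in half would leave an invalid sequence at the end of the field.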
I read https://github.com/CharsetDetector/UTF-unknown/issues/143 and learned that it may be the cause.
This has nothing to do with WAV files. UTF-unknown is only used by the library to detect the encoding of CUE sheets.
Thanks for your detailed explanation; it answered a lot of questions. I apologize for my ambiguous description. I fully agree with your answer: the garbled text in Windows Explorer doesn't actually matter in sound production, and I have felt the benefit of UTF-8, especially when I work with people who use macOS.
Besides, I finally found the most important bug:
All the Chinese metadata written to my bext chunk turned into question marks (byte 0x3F). This is actually caused by
WavHelper.writeFixedTextValue(description, 256, w);
which uses Latin1Encoding to encode the text. I inferred that Latin1Encoding.GetBytes(utf8Text) may return 0x3F (question mark) for characters outside its range.
I verified the problem: it is indeed caused by GetBytes.
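The behavior is easy to reproduce outside the library. The sketch below uses the built-in ISO-8859-1 encoding, whose default encoder fallback replaces unsupported characters with '?'. The class and method names are made up for this example.

```csharp
using System;
using System.Text;

public static class ReplacementDemo
{
    // Encodes with the default ISO-8859-1 encoder, which substitutes '?' (0x3F)
    // for every character it cannot represent.
    public static byte[] EncodeLatin1(string text) =>
        Encoding.GetEncoding("ISO-8859-1").GetBytes(text);

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(EncodeLatin1("中文")));   // 3F-3F
        Console.WriteLine(BitConverter.ToString(EncodeLatin1("\u00E9"))); // E9

        // With an exception fallback, the same call fails loudly instead of
        // silently writing question marks.
        Encoding strict = Encoding.GetEncoding("ISO-8859-1",
            EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        try { strict.GetBytes("中文"); }
        catch (EncoderFallbackException) { Console.WriteLine("not representable"); }
    }
}
```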
Perfect, thanks for confirming
I'm gonna publish a fixed version in the following days. Stay tuned~
The fix is available in today's v4.34.
The problem
When I wrote some Chinese metadata to a WAV file, the metadata came out garbled. I tried to decipher the garbled text and found that it had been encoded with ISO-8859-1 but decoded as UTF-8. Besides that, all the Chinese metadata written to my bext chunk turned into question marks (byte 0x3F). I am wondering why. Is there any way to avoid writing garbled text?
Environment
Tested in a codespace and on Windows, with .NET 7.
Details