Closed PixelsByLucas closed 1 year ago
Are you referring to this body property? https://hiraokahypertools.github.io/msgreader/typedoc/interfaces/MsgReader.FieldsData.html#body
A null character can end up in body for a number of reasons.
A body that contains a null character may be an edge case.
msgreader has value comparison tests. They compare JSON files built from the property key-values extracted from msg files. https://github.com/HiraokaHyperTools/msgreader/tree/master/test Currently there is no known case of a null character appearing in body.
If you have enough time to check this case, you can try another msg parser library, such as msg-parser on npm.
If you want to browse the raw contents of the msg file, 7-Zip File Manager for Windows is the best tool for exploring it.
Select __substg1.0_1000001F and press the F3 key to open it in Notepad or a similar viewer.
Reading the __substg1.0_1000001F file with a binary editor may help to identify the problem.
The property type of __substg1.0_1000001F is PtypString. It is a string of Unicode characters in UTF-16LE encoding.
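To check an exported copy of that stream for null characters, a minimal sketch along these lines may help. It assumes the stream was saved to disk (for example with 7-Zip) and loaded as raw bytes. The helper names decodeUtf16le and findNullCodeUnits are hypothetical and are not part of msgreader.

```typescript
// Hypothetical helpers for inspecting a PtypString stream; not part of msgreader.

// Decode UTF-16LE bytes into a JavaScript string, one 16-bit code unit at a time.
function decodeUtf16le(bytes: Uint8Array): string {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const units: string[] = [];
  // Read two bytes per code unit, little-endian; a trailing odd byte is ignored.
  for (let i = 0; i + 1 < bytes.byteLength; i += 2) {
    units.push(String.fromCharCode(view.getUint16(i, true)));
  }
  return units.join("");
}

// Return the index of every U+0000 code unit in the decoded text.
function findNullCodeUnits(text: string): number[] {
  const positions: number[] = [];
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) === 0) {
      positions.push(i);
    }
  }
  return positions;
}
```

In Node.js, the Buffer returned by fs.readFileSync is a Uint8Array, so it can be passed to decodeUtf16le directly. If findNullCodeUnits returns an empty array for the exported stream but body still contains "\u0000", the null character is probably not coming from the stored string itself.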
[MS-OXCDATA]: Property Data Types | Microsoft Learn
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Is it possible that FieldsData body can produce a null character like "\u0000" instead of the actual email body?
I noticed this in my logs from a PostgreSQL exception: invalid byte sequence for encoding "UTF8": 0x00. This is caused by trying to insert a null character into a text field. The value I was trying to insert came from FieldsData body. I noticed that the same email will sometimes produce this result, while other times the body is parsed fine and everything works as expected. Also, when this issue happens, other data like messageDeliveryTime and recipients are parsed correctly.
Any idea what could be causing this?
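Until the cause is found, one possible workaround is to detect and strip null characters before the value reaches PostgreSQL, since text columns reject 0x00. The sketch below assumes the body is already available as a string. The helper names containsNullChar and stripNullChars are hypothetical. This only hides the symptom and does not explain why the parsed body varies between runs.

```typescript
// Hypothetical guard helpers; not part of msgreader or any PostgreSQL client.

// True if the value contains at least one U+0000 character.
function containsNullChar(value: string): boolean {
  return value.indexOf("\u0000") !== -1;
}

// Remove every U+0000 character so the value can be stored in a text column.
function stripNullChars(value: string): string {
  return value.replace(/\u0000/g, "");
}
```

Logging the message whenever containsNullChar returns true may also help collect a reproducible sample for the maintainers.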