HiraokaHyperTools / msgreader

RangeError: Offset is outside the bounds of the Dataview #15

Closed mynpmpackage closed 3 years ago

mynpmpackage commented 3 years ago

RangeError: Offset is outside the bounds of the DataView

This error does not occur for every email, only for one whose body is just three short lines of text. I am not sure what makes this email different; the parser fails only when it parses this short email.

There is a while loop in the fieldsRootProperties method of MsgReader.ts. I can work around the issue by adding the condition below before reading from propertiesDs as an array, but I am not sure why the loop reads a propertyTag of zero.

    while (!propertiesDs.isEof()) {
      const propertyTag = propertiesDs.readUint32();
      if (propertyTag != 0) {
        // const flags = propertiesDs.readUint32();
        const arr = propertiesDs.readUint8Array(8);
        const dataView = new DataView(arr.buffer);
        parserConfig.propertyObserver(fields, propertyTag, arr);
        const typeConverter = typeConverters[propertyTag & 0xFFFF];
      }
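For context, a DataView throws this RangeError when a read would start or end beyond its buffer. The following is a minimal sketch of a bounds-checked loop, not the library's actual code. It assumes fixed 16-byte entries (a 4-byte tag, 4-byte flags, and 8 value bytes, all little-endian) with no header; `readEntries` and `PropertyEntry` are hypothetical names used only for illustration.

```typescript
// Hypothetical helper, not part of msgreader: walks fixed-size 16-byte
// property entries and stops before any read could leave the buffer.
interface PropertyEntry {
  tag: number;       // 32-bit property tag (ID in the high 16 bits, type in the low 16 bits)
  flags: number;     // 32-bit flags field
  value: Uint8Array; // 8 raw value bytes
}

const ENTRY_SIZE = 16; // assumption: 4 (tag) + 4 (flags) + 8 (value)

function readEntries(bytes: Uint8Array): PropertyEntry[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const entries: PropertyEntry[] = [];
  // Only enter the loop body when a whole entry still fits in the buffer.
  for (let offset = 0; offset + ENTRY_SIZE <= view.byteLength; offset += ENTRY_SIZE) {
    const tag = view.getUint32(offset, true);
    const flags = view.getUint32(offset + 4, true);
    if (tag === 0) {
      continue; // skip entries whose tag is zero
    }
    entries.push({ tag, flags, value: bytes.slice(offset + 8, offset + ENTRY_SIZE) });
  }
  return entries;
}
```

Checking `offset + ENTRY_SIZE <= view.byteLength` before each read means trailing bytes that do not form a complete entry are ignored rather than triggering the exception.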

kenjiuno commented 3 years ago

Hi thanks for reporting. If you don't mind, could you send a sample msg file to ku@digitaldolphins.jp?

Please let me know if this code change works:

mynpmpackage commented 3 years ago

Thanks kenjiuno for your quick response and fix.

I will test and let you know. Sorry, the email is in a restricted zone, so I am not allowed to send it.

When I opened the .msg in 7-Zip, all the properties looked the same as in a normal email, and the body content was as below.

Trusted Address email
From
outlook

mynpmpackage commented 3 years ago

Tried the alpha in the CLI. The parser now parses this email as it does other emails, but I do not see the body content in the parsed result. I will try other emails. Is the body not being extracted when the tag is zero?

mynpmpackage commented 3 years ago

I see the body comes as HTML in (0x10130102, PidTagHtml, PtypBinary), which the parser does not read. The parser is able to read the HTML content when I add the '1013' mapping as below:

 NAME_MAPPING: {
        // email specific
        '0037': 'subject',
        '0c1a': 'senderName',
        '0c1f': 'senderEmail',
        '5d0a': 'creatorSMTPAddress',
        '5d0b': 'lastModifierSMTPAddress',
        '1000': 'body',
        '007d': 'headers',
        '1009': 'compressedRtf',
        '3ffa': 'lastModifierName',
        '1013': 'bodyHtml', // added
        // attachment specific

Is this the reason we get error in body?
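To make the tag notation above concrete: a 32-bit property tag such as 0x10130102 carries the property ID in its high 16 bits and the property type in its low 16 bits. The sketch below splits a tag that way; `splitPropertyTag` is a hypothetical helper written for illustration, not a msgreader function. The two type codes are the standard MAPI values for PtypString and PtypBinary.

```typescript
// Standard MAPI property type codes.
const PTYP_STRING = 0x001f; // UTF-16LE string
const PTYP_BINARY = 0x0102; // raw bytes

// Hypothetical helper: splits a 32-bit property tag into its parts.
function splitPropertyTag(tag: number): { id: string; type: number } {
  // High 16 bits: property ID, formatted as 4 lowercase hex digits
  // to match keys such as '1013' or '0c1a' in NAME_MAPPING.
  const hex = (tag >>> 16).toString(16);
  const id = ("0000" + hex).slice(-4);
  // Low 16 bits: property type.
  const type = tag & 0xffff;
  return { id, type };
}

const parts = splitPropertyTag(0x10130102);
console.log(parts.id, parts.type === PTYP_BINARY); // prints "1013 true"
```

Under this reading, 0x1013001f and 0x10130102 share the ID '1013' and differ only in type, so a mapping keyed on the ID alone cannot distinguish a string body from a binary one.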

With the alpha build, we confirm that the issue related to Range error is fixed. Thanks.

Could you please let me know when you are planning for official release?

kenjiuno commented 3 years ago

> Is this the reason we get error in body?

> Could you please let me know when you are planning for official release?

PidTagBodyHtml Canonical Property | Microsoft Docs

'1013': 'bodyHtml' can be included for that purpose. However, I don't have any sample msg files containing 1013 bodyHtml, so I cannot validate this.

Can you provide a sample msg file, or share how to create one?

mynpmpackage commented 3 years ago

Hi Kenjiuno, I have emailed you the sample email we have the issue with. Kindly help us with the fix.

Below is the structure we see when opening it with 7-Zip:

image

kenjiuno commented 3 years ago

Thanks, I have received your mail with the msg file.

The RangeError has not been tested yet this time.

I have added html and bodyHtml.

And then published:

Your sample contains 10130102. 0102 is PT_BINARY, so it needs to be decoded with some encoding such as UTF-8.

e.g.

https://github.com/HiraokaHyperTools/msgreader/blob/ac31ea2df9aaf69eb5689bbbc1848edb2fed2637/cli.js#L167-L175
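Separately from the linked cli.js code, here is a minimal sketch of the decoding step, assuming the parsed result exposes `bodyHtml` as a string (1013001f) and `html` as raw bytes (10130102), as the field names in this thread suggest. `HtmlFields` and `pickHtml` are hypothetical names for illustration, and the field types are assumptions. The sketch uses the standard `TextDecoder` API.

```typescript
// Hypothetical shape: only the two fields discussed in this thread.
interface HtmlFields {
  bodyHtml?: string; // 1013001f: already a string (assumed type)
  html?: Uint8Array; // 10130102: PT_BINARY, raw bytes (assumed type)
}

// Returns the HTML body as a string, decoding the binary form when
// no string form is present. The caller must supply the encoding,
// because the bytes alone do not say which one was used.
function pickHtml(fields: HtmlFields, encoding: string = "utf-8"): string | undefined {
  if (typeof fields.bodyHtml === "string") {
    return fields.bodyHtml;
  }
  if (fields.html) {
    return new TextDecoder(encoding).decode(fields.html);
  }
  return undefined;
}
```

Keeping the encoding as a parameter with a UTF-8 default mirrors the choice a caller has to make for binary HTML: a wrong guess still returns a string, just with garbled non-ASCII characters.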

You can try this with cli.js:

H:\Proj\msgreader>node cli html -h
Usage: cli html [options] <msgFilePath>

Parse msg file and display 1013001f:bodyHtml or 10130102:html

Options:
  -e, --encoding <encoding>  The encoding type to decode binary html. (default: "utf8")
  -h, --help                 display help for command

Sample usage:

node cli html -e latin1 test\html.msg

mynpmpackage commented 3 years ago

Thanks for the alpha build.

I am not able to install this version using npm i @kenjiuno/msgreader@1.7.3-alpha.2. I also tried downloading the package to run the CLI, but the CLI fails with errors like those below.

[3 images]

kenjiuno commented 3 years ago

Ah sorry, cli.js needs devDependencies that are not included in the npmjs package. Please git clone this repository.

git clone https://github.com/HiraokaHyperTools/msgreader.git
cd msgreader
yarn
node cli

mynpmpackage commented 3 years ago

I missed yarn install. After running yarn as you mentioned, I can now see the body HTML. Thanks. So the parser now works for HTML content.

I am still not able to install this alpha version using npm.

Is it planned to be released officially today?

kenjiuno commented 3 years ago

OK, I will proceed to publish.

mynpmpackage commented 3 years ago

Thanks Kenjiuno.

kenjiuno commented 3 years ago

Just now published.

mynpmpackage commented 3 years ago

I am not able to get the latest published version, 1.7.3. Only versions up to 1.7.3-alpha.1 are available when I try both npm install and yarn add. Could you please help me get the latest?

image

kenjiuno commented 3 years ago

Please try:

If you have yarn.lock try yarn:

yarn add @kenjiuno/msgreader@1.7.3
or
yarn add @kenjiuno/msgreader@latest

If you have package-lock.json try npm:

npm install --save @kenjiuno/msgreader@1.7.3
or
npm install --save @kenjiuno/msgreader@latest

mynpmpackage commented 3 years ago

It seems I have issues because of my local package store. I will check and let you know.

kenjiuno commented 3 years ago

Hmm. There may be a cache problem or similar. First try:

npm cache clean --force

https://stackoverflow.com/a/66428125

Alternatively, installing from the repository's tar.gz may help.

yarn add https://github.com/HiraokaHyperTools/msgreader/archive/refs/tags/v1.7.3.tar.gz
or
npm i https://github.com/HiraokaHyperTools/msgreader/archive/refs/tags/v1.7.3.tar.gz

kenjiuno commented 3 years ago

Ah sorry, I have noticed that installing from the repository's tar.gz won't help. Please do not do it.

mynpmpackage commented 3 years ago

I got the npm issue fixed and can now see the latest version. I see the html property in the parsed result. Thanks for quickly fixing and publishing. Can we close the issue?

kenjiuno commented 3 years ago

OK thx, now closing!