epfromer / pst-extractor

Extract objects from MS Outlook/Exchange PST files

`fs` and browser support #4

Closed redchair123 closed 5 years ago

redchair123 commented 6 years ago

Now that #1 has been settled, there are no node-specific dependencies, so in theory the module can be used in the browser. The problem is the reliance on fs, but since you've neatly factored the whole thing into PSTFile it should be easy to support an fs-less workflow!

To do this, I would propose:

1) adding a second optional parameter to the constructor:

    public constructor(fileName: string, data?: Uint8Array)

If data is passed, then pstFD should be set to -1 (instead of calling openSync) and data should be captured in a private field.

2) close should check if pstFD is positive:

    public close() {
        if(this.pstFD > 0) fs.closeSync(this.pstFD);
    }

3) the readSync calls should be replaced with Buffer#copy. For example, https://github.com/epfromer/pst-extractor/blob/master/src/PSTFile/PSTFile.class.ts#L802 becomes:

        const bytesRead = this.pstFD > 0 ? fs.readSync(this.pstFD, buffer, 0, buffer.length, pos) : this.data.copy(buffer, 0, pos, pos + buffer.length);
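
The three steps above can be sketched as one small class. This is only an illustration under stated assumptions: `ByteSource` and its method names are hypothetical and not part of pst-extractor, and it takes a `Buffer` rather than a `Uint8Array` because `copy` is a `Buffer` method. It also tests `pstFD >= 0` instead of `> 0`, since 0 is a valid file descriptor.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper (not pst-extractor code): serves reads either from an
// open file descriptor or from a Buffer held in memory.
class ByteSource {
    private pstFD = -1;      // -1 means "no descriptor, read from memory"
    private data?: Buffer;

    constructor(source: string | Buffer) {
        if (Buffer.isBuffer(source)) {
            this.data = source;
        } else {
            this.pstFD = fs.openSync(source, 'r');
        }
    }

    // Fill `buffer` from byte offset `pos`; returns the number of bytes read.
    read(buffer: Buffer, pos: number): number {
        if (this.pstFD >= 0) {
            return fs.readSync(this.pstFD, buffer, 0, buffer.length, pos);
        }
        // Buffer#copy(target, targetStart, sourceStart, sourceEnd)
        return this.data!.copy(buffer, 0, pos, pos + buffer.length);
    }

    // Only the file-backed mode has anything to close.
    close(): void {
        if (this.pstFD >= 0) {
            fs.closeSync(this.pstFD);
            this.pstFD = -1;
        }
    }
}

// Demo: read the same 4 bytes at offset 3 from disk and from memory.
const sample = Buffer.from([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]);
const tmpFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'bytesource-')), 'sample.bin');
fs.writeFileSync(tmpFile, sample);

const fromDisk = new ByteSource(tmpFile);
const fromMemory = new ByteSource(sample);
const a = Buffer.alloc(4);
const b = Buffer.alloc(4);
const readA = fromDisk.read(a, 3);
const readB = fromMemory.read(b, 3);
fromDisk.close();
fromMemory.close();
console.log(readA, readB, a.equals(b));
```

Keeping the branch inside a single `read` method means the rest of the parser would not need to know which mode it is in.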

epfromer commented 6 years ago

I'm game, and will look into this over the coming days.

lukeburpee commented 6 years ago

Not to add needless dependencies, but here are a few packages that might be worth checking out for a browser implementation:

https://www.npmjs.com/package/random-access-storage
https://www.npmjs.com/package/browserfs
https://www.npmjs.com/package/random-access-http
https://www.npmjs.com/package/random-access-file-reader

epfromer commented 6 years ago

I got an hour this afternoon to look into this, and pushed the result to GitHub (not yet published to npm). Check out the changes at https://github.com/epfromer/pst-extractor/commit/6a1c1b21beecc2ce955556090afa01f8f960a200.

Instead of a second param on the constructor, I use instanceof for cheap overloading. It now looks like this:

    public constructor(pstBuffer: Buffer);
    public constructor(fileName: string);
    public constructor(arg: any) {
        if (arg instanceof Buffer) {
            // use an in-memory buffer of PST
            this.pstBuffer = arg;
            this.pstFD = -1;
        } else {
            // use PST in filesystem
            this._pstFilename = arg;
            this.pstFD = fs.openSync(this._pstFilename, 'r');
        }
    }

I also added a new test script (test-in-mem.ts) which simply loads a PST into memory before passing it to the constructor of PSTFile.

As you might expect, the processing of the in-memory PST is WAY FASTER than from disk (even SSD).
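
One plausible reason for the gap is that every `fs.readSync` call goes through the operating system, while `Buffer#copy` stays inside the process. The rough timing sketch below shows how to check this on your own machine. It is not taken from the repository: the file is a throwaway stand-in filled with arbitrary bytes, and the chunk size and read count are assumptions chosen only for illustration.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Throwaway 4 MiB file standing in for a PST (contents are arbitrary).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'pst-bench-'));
const file = path.join(dir, 'fake.pst');
const size = 4 * 1024 * 1024;
fs.writeFileSync(file, Buffer.alloc(size, 7));

const chunk = Buffer.alloc(512);
const reads = 20000;

// Deterministic scattered offsets so both runs touch the same bytes.
function offsetAt(i: number): number {
    return (i * 7919 * 512) % (size - chunk.length);
}

// Runs `readAt` for every offset, logs the elapsed time, returns total bytes read.
function timeIt(label: string, readAt: (pos: number) => number): number {
    const start = process.hrtime();
    let total = 0;
    for (let i = 0; i < reads; i++) {
        total += readAt(offsetAt(i));
    }
    const [sec, nano] = process.hrtime(start);
    const ms = sec * 1000 + nano / 1e6;
    console.log(`${label}: ${ms.toFixed(1)} ms for ${reads} reads`);
    return total;
}

// File-backed reads: one fs.readSync call per chunk.
const fd = fs.openSync(file, 'r');
const viaDisk = timeIt('fs.readSync', pos => fs.readSync(fd, chunk, 0, chunk.length, pos));
fs.closeSync(fd);

// In-memory reads: load the file once, then copy slices out of the Buffer.
const inMemory = fs.readFileSync(file);
const viaMemory = timeIt('Buffer#copy', pos => inMemory.copy(chunk, 0, pos, pos + chunk.length));
```

The numbers depend heavily on the OS page cache and the hardware, so treat them as a rough indication rather than a benchmark.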

Is this generally in the right direction? I'm still not sure this will work in a browser, as I'm using a bunch of Node.js objects.

ivanbreet commented 5 years ago

Hi, I'm trying to load a buffer into new PSTFile(someBuffer) but I'm getting Error: ENOENT: no such file or directory on the latest version (1.3.2).

After compiling it locally, it appears to work. It looks like there is something funny with the buffer constructor of the 1.3.2 version on NPM.

epfromer commented 5 years ago

Thanks, I will look into it.

ivanbreet commented 5 years ago

Any luck with publishing the latest changes?

epfromer commented 5 years ago

Sorry, I have been distracted with other projects. I'll pick this up again this week.

epfromer commented 5 years ago

I had to reconstruct my dev machine, and have done that, but I can't reproduce the issue. Can you provide more detail? I plan to update the dependencies and release a new version nonetheless. Has anyone else experienced this problem, and if so, can you please comment?

epfromer commented 5 years ago

@ivanbreet - this might be related to your issue. https://nodejs.org/en/docs/guides/buffer-constructor-deprecation/

I have ported the code to the new Buffer constructors and am running tests now. So far, so good.
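
For anyone following along, the migration described in that guide looks roughly like this. It is a generic sketch of the replacement calls, not the actual diff from pst-extractor, and the variable names are made up for the example.

```typescript
// Deprecated forms, shown only as comments:
//   new Buffer(16)             -> Buffer.alloc(16)
//   new Buffer([0x21, 0x42])   -> Buffer.from([0x21, 0x42])
//   new Buffer('abc', 'utf8')  -> Buffer.from('abc', 'utf8')

// Zero-filled buffer of a given size.
const zeroed = Buffer.alloc(16);

// Buffer built from an array of byte values (these four spell "!BDN" in ASCII).
const fromBytes = Buffer.from([0x21, 0x42, 0x44, 0x4e]);

// Buffer built from a string with an explicit encoding.
const fromText = Buffer.from('abc', 'utf8');

// Uninitialized memory; it must be overwritten before anyone reads it.
const scratch = Buffer.allocUnsafe(16);
scratch.fill(0);

console.log(zeroed.length, fromBytes.toString('ascii'), fromText.length, scratch.equals(zeroed));
```

`Buffer.alloc` is the safe default. `Buffer.allocUnsafe` skips the zero fill, so it only makes sense when the code overwrites every byte straight away.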

epfromer commented 5 years ago

I've released a new 1.5.0 version. Please check it out.

ivanbreet commented 5 years ago

Thank you @epfromer. The Buffer release will work nicely with some AWS Lambda functions we are working on.

I will have a look at the latest release and let you know.