rjohnsondev / java-libpst

A library to read PST files with java, without need for external libraries.

PST files 4gb or more - exception in PSTFile #97

Open ghost opened 2 years ago

ghost commented 2 years ago

I've taken the latest code from the develop branch of java-libpst. It works fine for PST files smaller than 4 GB. For PST files of 4 GB or more, I get an exception in PSTFile.java when it tries to read the PST file:

com.pff.PSTException: Unable to find node: 5006 is desc: false
    at com.pff.PSTFile.findBtreeItem(PSTFile.java:818)
    at com.pff.PSTFile.getOffsetIndexNode(PSTFile.java:842)
    at com.pff.PSTFile.readLeaf(PSTFile.java:567)
    at com.pff.PSTFile.getPSTDescriptorItems(PSTFile.java:850)
    at com.pff.PSTFile.processNameToIdMap(PSTFile.java:265)
    at com.pff.PSTFile.<init>(PSTFile.java:197)
    at com.pff.PSTFile.<init>(PSTFile.java:135)
    at com.pff.PSTFile.<init>(PSTFile.java:123)
    at com.aptp.utif.impl.MigrationTool.processReadMailsBatch(MigrationTool.java:797)

The invocation code is:

PSTFile pstFile = new PSTFile( MigrationTool.EFS_SHARE + "/" +contents[i] );

The PST file was created using Outlook 365.

Could you please help with this? Thanks a lot.

mooijtech commented 2 years ago

Hello serajdarak,

Can you try the pstreader library? https://github.com/Jmcleodfoss/pstreader

There is an explorer available here: https://github.com/Jmcleodfoss/pstreader/blob/master/explorer/README.md

The JAR can be downloaded here: https://repo1.maven.org/maven2/io/github/jmcleodfoss/explorer/1.1.2/

Or you can use the library directly: https://github.com/Jmcleodfoss/pstreader/tree/master/pst

Just to be sure your PST file isn't corrupted.

It seems like there is a problem when processing the Name to ID map.

Kind regards, Marten Mooij

ghost commented 2 years ago

Thanks for the advice.

I tried the pstreader explorer, as you suggested, to check the PST file.

First, I generated a 2 GB file using the Outlook client. I ran pstreader, and it opened the file and showed the details (successful).

Next, I generated a 4 GB file using the same Outlook client. I ran pstreader, and it could not open the file, giving an error message. I've taken a screenshot; it mentions a maximum value for the size of the PST file.

[Screenshot: pstreader explorer error message mentioning a maximum PST file size]

mooijtech commented 2 years ago

From James McLeod:

"That’s a limitation of the Java nio standard library (the file position is given by a 32-bit integer). I have never stress tested my library with a large file so didn’t notice this limitation. I will have to think about whether and how to rearchitect it to get around this problem."

Most likely java-libpst has the same problem (not sure).

I guess the solution in the meantime is to split up your PST files.

rjohnsondev commented 2 years ago

I believe file positions in Java NIO are [long](https://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#position()), so the 32-bit limitation would be specific to that library. It is likely that PST originally only used 32-bit references for file positions internally, so an int was all that was required.
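To illustrate the point about `long` file positions, here is a minimal sketch using only the JDK. It positions a `FileChannel` past the 4 GiB mark on an empty temporary file (setting a position beyond the end of a file is legal and does not grow it) and shows what the same offset becomes when narrowed to an `int`. The class and method names are made up for this example and are not taken from java-libpst or pstreader.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LargeOffsetDemo {

    /** Seeks an empty temp file to the given offset and returns the position the channel reports. */
    static long seekAndReport(long offset) throws IOException {
        Path tmp = Files.createTempFile("offset-demo", ".bin");
        try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
            channel.position(offset);   // takes a long, so offsets past 4 GiB are fine
            return channel.position();  // also returns a long
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    /** Narrows the offset to an int, as code written for 32-bit offsets would. */
    static int truncateToInt(long offset) {
        return (int) offset;            // silently drops the upper 32 bits
    }

    public static void main(String[] args) throws IOException {
        long offset = 5_000_000_000L;   // roughly 4.66 GiB, beyond the 32-bit range
        System.out.println(seekAndReport(offset));  // prints 5000000000
        System.out.println(truncateToInt(offset));  // prints 705032704
    }
}
```

Note that `ByteBuffer` indices and the size argument of `FileChannel.map` are still limited to `Integer.MAX_VALUE`, which may be where a 32-bit limit creeps into a library even though the channel itself accepts `long` positions.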

It seems logical that the file location encoding would change for files over 4 GB to accommodate the larger offsets. If you are able to generate a PST file over 4 GB that can be shared, I'll try to allocate time to dig into it.
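As a rough illustration of why the width of a stored file offset matters, the sketch below decodes the same little-endian bytes once as a 4-byte field and once as an 8-byte field. It assumes offsets are stored little-endian and that a newer format variant widens them from 32 to 64 bits; that layout is an assumption made for this example, not something verified against java-libpst's parsing code, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class OffsetDecoding {

    /** Reads a 4-byte little-endian offset and widens it as an unsigned value. */
    static long readOffset32(byte[] raw, int index) {
        int value = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt(index);
        return Integer.toUnsignedLong(value);   // maximum is 4 GiB - 1
    }

    /** Reads an 8-byte little-endian offset. */
    static long readOffset64(byte[] raw, int index) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getLong(index);
    }

    public static void main(String[] args) {
        // Encode an offset just under 4.66 GiB as 8 little-endian bytes.
        byte[] raw = ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(5_000_000_000L)
                .array();

        System.out.println(readOffset64(raw, 0));  // prints 5000000000
        System.out.println(readOffset32(raw, 0));  // prints 705032704 (low 4 bytes only)
    }
}
```

A reader that treats a 64-bit field as 32 bits would land on a valid-looking but wrong location in the file, which is one plausible way to end up with a lookup failure like "Unable to find node".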

mooijtech commented 2 years ago

Hello Richard,

I have forwarded your message to James McLeod.

There are 50 GB worth of PST files from Hacking Team, ranging from 4 GB to 11 GB each, available via this torrent magnet link (see the folders mail, mail2, and mail3):

magnet:?xt=urn:btih:51603bff88e0a1b3bad3962614978929c9d26955&dn=Hacked%20Team&tr=udp%3A%2F%2Fcoppersurfer.tk%3A6969%2Fannounce&tr=udp%3A%2F%2F9.rarbg.me%3A2710%2Fannounce&tr=http%3A%2F%2Fmgtracker.org%3A2710%2Fannounce&tr=http%3A%2F%2Fbt.careland.com.cn%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.com%3A1337&tr=udp%3A%2F%2Fexodus.desync.com%3A6969&tr=udp%3A%2F%2Ftracker.leechers-paradise.org%3A6969&tr=udp%3A%2F%2Ftracker.pomf.se&tr=udp%3A%2F%2Ftracker.blackunicorn.xyz%3A6969

Kind regards, Marten Mooij

mooijtech commented 2 years ago

The latest version of pstreader now supports larger PST files (over 4 GB). I have also published my own library for reading PST files in Go: https://github.com/mooijtech/go-pst

If you try both pstreader and go-pst and send me any error you get, or let me know that it works fine, we can debug this issue further.