omicronapps / 7-Zip-JBinding-4Android

Android Java wrapper for 7z archiver engine
GNU Lesser General Public License v2.1

Issue while extracting files to internal storage #8

Closed by fiasko131 3 years ago

fiasko131 commented 3 years ago

I am trying to use this library to extract RAR files on Android. I started from your extraction example:

private ISimpleInArchiveItem mISimpleInArchiveItem;
private IInArchive mInArchive;
private String destPath;
// ... (executed in an AsyncTask)
@Override
 protected Void doInBackground(Void... voids) {
       extract(archivePath);
       return null; // Void is an object type, so doInBackground must return a value
 }
// ...
private void extract(String archivePath){
        try {
            RandomAccessFile randomAccessFile = new RandomAccessFile(new File(archivePath), "r");
            RandomAccessFileInStream inStream = new RandomAccessFileInStream(randomAccessFile);
            ArchiveOpenCallback callback = new ArchiveOpenCallback();
            mInArchive = SevenZip.openInArchive(null, inStream, callback);

            int itemCount = mInArchive.getNumberOfItems();
            SequentialOutStream outStream = new SequentialOutStream();
            for (int i = 0; i < itemCount; i++) {
                ISimpleInArchiveItem iSimpleInArchiveItem = mInArchive.getSimpleInterface().getArchiveItem(i);
                if (!iSimpleInArchiveItem.isFolder()){
                    mISimpleInArchiveItem = iSimpleInArchiveItem;
                    ExtractOperationResult result = mInArchive.extractSlow(i, outStream);

                    if (result != ExtractOperationResult.OK) {
                        Log.i("result", result.toString());
                    }
                }

            }

            mInArchive.close();
            inStream.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (SevenZipException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

private class SequentialOutStream implements ISequentialOutStream {
        @Override
        public int write(byte[] data) throws SevenZipException {
            if (data == null || data.length == 0) {
                throw new SevenZipException("null data");
            }
            if (!mISimpleInArchiveItem.isFolder()){
                InputStream inputStream = new BufferedInputStream(new ByteArrayInputStream(data));
                String extractedPath = destPath+"/"+mISimpleInArchiveItem.getPath();
                String parent = new File(extractedPath).getParent();
                String fileName = new File(extractedPath).getName();
                String ext = FileUtil.getExtension(new File(extractedPath));
                String baseName = FilenameUtils.getBaseName(fileName);
                Log.i("filename",fileName);
                Log.i("filename",ext);
                Log.i("filename",baseName);
                // check for duplicated extensions - for example filename.jpg.jpg
                if (baseName.contains("."+ext)){
                    DecompressAsyncTask.this.fileName = baseName;
                    extractedPath = parent+"/"+baseName;
                }else {
                    DecompressAsyncTask.this.fileName = fileName;
                    extractedPath = parent+"/"+fileName;
                }
                File folder = new File(parent);
                if (!folder.exists()){
                    folder.mkdirs();
                }
                try {

                    OutputStream outputStream = new FileOutputStream(new File(extractedPath));
                    int len;
                    byte buf[] = new byte[FileUtil.DEFAULT_BUFFER_SIZE];
                    while ((len = inputStream.read(buf)) != -1) {
                        outputStream.write(buf, 0, len);
                        totalBytesCount = totalBytesCount+ len;
                        publishProgress(totalBytesCount);

                    }
                    outputStream.close();
                    inputStream.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }

            Log.i("result", "Data to write: " + data.length);
            return data.length;
        }
    }

private class ArchiveOpenCallback implements IArchiveOpenCallback {
        @Override
        public void setTotal(Long files, Long bytes) {
            Log.i("result", "Archive open, total work: " + files + " files, " + bytes + " bytes");
        }

        @Override
        public void setCompleted(Long files, Long bytes) {
            Log.i("result", "Archive open, completed: " + files + " files, " + bytes + " bytes");
        }
    }

The code seems to work at first, but intermittently one or more files are extracted incorrectly: the size is wrong, so the file is corrupt.

In addition, no error is reported during extraction.

Tested on Android Q (API level 29) with .rar and .7z files, regardless of the ArchiveFormat selected: ArchiveFormat.RAR5, ArchiveFormat.SEVEN_ZIP, null, ...

I have attached the test files: https://drive.google.com/file/d/1tOZcwsJCTgD5zoEyWm8QvL1c8fM6wrS_/view?usp=sharing (rar) https://drive.google.com/file/d/12xr_1BlutVHF6sfK87BWTuMqrJ676p5M/view?usp=sharing (7z)

Is there something that I am doing wrong?

In the meantime, thank you for the great work porting this library to Android.

fiasko131 commented 3 years ago

So it seems that going through a BufferedInputStream is not the right approach... why?

Here is a solution that seems to work for me:

private void extract(String path, String destPath, String passWord){
        try {
            RandomAccessFile randomAccessFile = new RandomAccessFile(new File(path), "r");
            RandomAccessFileInStream inStream = new RandomAccessFileInStream(randomAccessFile);
            ArchiveOpenCallback callback = new ArchiveOpenCallback();
            IInArchive iInArchive = SevenZip.openInArchive(null, inStream, callback);

            int itemCount = iInArchive.getNumberOfItems();
            //SequentialOutStream outStream = new SequentialOutStream();
            for (int i = 0; i < itemCount; i++) {
                ISimpleInArchiveItem iSimpleInArchiveItem = iInArchive.getSimpleInterface().getArchiveItem(i);
                if (!iSimpleInArchiveItem.isFolder()){
                    String extractedPath = destPath+"/"+iSimpleInArchiveItem.getPath();
                    String parent = new File(extractedPath).getParent();
                    String fileName = new File(extractedPath).getName();
                    String ext = FileUtil.getExtension(new File(extractedPath));
                    String baseName = FilenameUtils.getBaseName(fileName);
                    Log.i("filename",fileName);
                    Log.i("filename",ext);
                    Log.i("filename",baseName);
                    if (baseName.contains("."+ext)){
                        this.fileName = baseName;
                        extractedPath = parent+"/"+baseName;
                    }else {
                        this.fileName = fileName;
                        extractedPath = parent+"/"+fileName;
                    }
                    File folder = new File(parent);
                    if (!folder.exists()){
                        folder.mkdirs();
                    }
                    OutputStream outputStream = new FileOutputStream(new File(extractedPath));
                    //mISimpleInArchiveItem = iSimpleInArchiveItem;
                    ExtractOperationResult result = iInArchive.extractSlow(i, data -> {
                        try {
                            outputStream.write(data);
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                        return data.length;
                    },passWord);

                    if (result != ExtractOperationResult.OK) {
                        Log.i("result", result.toString());
                    }
                }
            }

            iInArchive.close();
            inStream.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (SevenZipException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

However, the documentation for this method says: "Extract one item from archive. Multiple calls of this method are inefficient for some archive types." Any idea why?

omicronapps commented 3 years ago

Please note that for larger files ISequentialOutStream.write() can be called multiple times for one file, like this:

I/filename: IMG_20120711_035403.jpg.jpg
I/result: Data to write: 947714
I/filename: IMG_20120711_035403.jpg.jpg
I/result: Data to write: 816523

That gives a total uncompressed size of 1,764,237 bytes (947,714 + 816,523).

However, it looks like your implementation creates a new file each time write() is called, which overwrites the data written by the previous call:

                    OutputStream outputStream = new FileOutputStream(new File(extractedPath));

When extracting an item from the archive with IInArchive.extractSlow(), make sure the previous file has been completely written before opening a new file.
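The difference can be reproduced without the library. The following is a minimal sketch, not the 7-Zip-JBinding API: `ChunkSink` is a hypothetical stand-in for `ISequentialOutStream`, and the two chunk sizes are the ones from the log above. It assumes only that `new FileOutputStream(File)` truncates an existing file, which is standard Java behaviour.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;

class ChunkedWriteDemo {

    /** Hypothetical stand-in for ISequentialOutStream: called once per decoded chunk. */
    interface ChunkSink {
        int write(byte[] data) throws IOException;
    }

    /** Problem pattern: the file is reopened (and truncated) for every chunk. */
    static ChunkSink reopenPerChunk(File target) {
        return data -> {
            try (OutputStream out = new FileOutputStream(target)) {
                out.write(data);
            }
            return data.length;
        };
    }

    /** Fixed pattern: the stream is opened once per item and each chunk is appended to it. */
    static ChunkSink appendTo(OutputStream out) {
        return data -> {
            out.write(data);
            return data.length;
        };
    }

    /** Sends two chunks through the problem pattern and returns the final file size. */
    static long sizeWithReopen(File target, byte[] first, byte[] second) throws IOException {
        ChunkSink sink = reopenPerChunk(target);
        sink.write(first);
        sink.write(second);
        return target.length();
    }

    /** Sends two chunks through a single stream and returns the final file size. */
    static long sizeWithSingleStream(File target, byte[] first, byte[] second) throws IOException {
        try (OutputStream out = new FileOutputStream(target)) {
            ChunkSink sink = appendTo(out);
            sink.write(first);
            sink.write(second);
        }
        return target.length();
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("chunk-demo").toFile();
        byte[] first = new byte[947_714];
        byte[] second = new byte[816_523];
        // Only the last chunk survives: 816523
        System.out.println(sizeWithReopen(new File(dir, "reopened.bin"), first, second));
        // Both chunks are kept: 1764237
        System.out.println(sizeWithSingleStream(new File(dir, "single.bin"), first, second));
    }
}
```

In the real callback the same idea applies: open the output stream when an item starts, write every chunk to it, and close it once that item has finished.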

fiasko131 commented 3 years ago

Thank you for the answer.

I tried this implementation:

private void extract(String path){
        try {
            RandomAccessFile randomAccessFile = new RandomAccessFile(new File(path), "r");
            RandomAccessFileInStream inStream = new RandomAccessFileInStream(randomAccessFile);
            ArchiveOpenCallback callback = new ArchiveOpenCallback();
            IInArchive iInArchive = SevenZip.openInArchive(null, inStream, callback);
            ArchiveExtractCallback extractCallback = new ArchiveExtractCallback(iInArchive);
            iInArchive.extract(null, false, extractCallback);
        } catch (FileNotFoundException e) {

        } catch (SevenZipException e) {
        } catch (IOException e) {

        }
    }
private class ArchiveExtractCallback implements IArchiveExtractCallback {
        IInArchive iInArchive;
        File extractedFile;
        OutputStream outputStream;

        public ArchiveExtractCallback(IInArchive iInArchive){
            this.iInArchive = iInArchive;

        }
        @Override
        public ISequentialOutStream getStream(int index, ExtractAskMode extractAskMode) throws SevenZipException {
            Log.i("result", "Extract archive, get stream: " + index + " to: " + extractAskMode);
            //mISimpleInArchiveItem = mInArchive.getSimpleInterface().getArchiveItem(index);
            String path = iInArchive.getStringProperty(index, PropID.PATH);
            String folder = iInArchive.getStringProperty(index, PropID.IS_FOLDER);
            boolean isDir = false;
            extractedFile = new File(destPath,path);
            if (folder.equals("+")) isDir = true;
            if (isDir) {
                extractedFile.mkdirs();
            } else {
                try {
                    outputStream = new FileOutputStream(extractedFile);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            Log.i("result",path);
            return data -> {
                try {
                    outputStream.write(data);
                } catch (IOException e) {
                    e.printStackTrace();
                }
                Log.i("result", "Data to write: " + data.length);
                return data.length;
            };
        }

        @Override
        public void prepareOperation(ExtractAskMode extractAskMode) throws SevenZipException {
            Log.i("result", "Extract archive, prepare to: " + extractAskMode);
        }

        @Override
        public void setOperationResult(ExtractOperationResult extractOperationResult) throws SevenZipException {
            Log.i("result", "Extract archive, completed with: " + extractOperationResult);
            if (extractOperationResult != ExtractOperationResult.OK) {
                extractedFile.delete();
                throw new SevenZipException(extractOperationResult.toString());

            }else {
                if (!extractedFile.isDirectory()){
                    try {
                        outputStream.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
        }

        @Override
        public void setTotal(long total) throws SevenZipException {
            Log.i("result", "Extract archive, work planned: " + total);
        }

        @Override
        public void setCompleted(long complete) throws SevenZipException {
            Log.i("result", "Extract archive, work completed: " + complete);
        }
    }

It works fine, except with RAR5 archives, where the ExtractOperationResult is DATAERROR.

omicronapps commented 3 years ago

Depending on the archive type, a directory may not be included as a separate item; instead, it may only appear as part of an archived item's path. You will need to detect this during extraction and create the directory in the file system before attempting to write the file to disk.

        public ISequentialOutStream getStream(int index, ExtractAskMode extractAskMode) throws SevenZipException {
...
            // Guard against a null ATTRIBUTES value before unboxing
            Integer attributes = (Integer) iInArchive.getProperty(index, PropID.ATTRIBUTES);
            boolean isDir = attributes != null
                    && (attributes & PropID.AttributesBitMask.FILE_ATTRIBUTE_DIRECTORY) != 0;

            // Create the item's parent directory; mkdirs() also creates missing intermediate levels
            File dirFile = new File(path);
            String dir = dirFile.getParent();
            if (dir != null) {
                File extractedDir = new File(destPath, dir);
                boolean dirCreated = extractedDir.mkdirs();
            }
...
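
The directory handling described above can be factored into a small helper that uses only `java.io.File`. This is a sketch under stated assumptions: `prepareTarget` and `ExtractTargets` are hypothetical names, not part of 7-Zip-JBinding, and the `itemPath` and `isDir` arguments are assumed to come from the item's properties as in the snippet above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class ExtractTargets {

    /**
     * Resolves an archive item path under destDir and makes sure the directory
     * that will hold it exists. Hypothetical helper for illustration only.
     */
    static File prepareTarget(File destDir, String itemPath, boolean isDir) throws IOException {
        File target = new File(destDir, itemPath);

        // For a directory item create the directory itself, otherwise create the file's parent.
        File dirToCreate = isDir ? target : target.getParentFile();

        // mkdirs() creates every missing level; mkdir() would fail if a parent level is missing.
        if (dirToCreate != null && !dirToCreate.isDirectory() && !dirToCreate.mkdirs()) {
            throw new IOException("Could not create directory " + dirToCreate);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        File dest = Files.createTempDirectory("extract-demo").toFile();
        File target = prepareTarget(dest, "photos/2012/IMG_0001.jpg", false);
        // The parent directories now exist, so opening a FileOutputStream on target can succeed.
        System.out.println(target.getParentFile().isDirectory());
    }
}
```

Calling such a helper before opening the output stream means the stream can be created even when the archive has no separate entry for the directory. Throwing instead of only logging also makes a failed directory creation visible to the caller.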
fiasko131 commented 3 years ago

Thank you. Indeed, the attributes differ between .zip, .7z, .tar and .rar. So is it better to go through this method systematically rather than extractSlow?

omicronapps commented 3 years ago

So is it better to go through this method systematically rather than extractSlow?

Yes. If you are extracting the whole archive, using IInArchive.extract() is generally more efficient.
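
One likely reason, assuming a solid archive (several items compressed as one continuous stream, as .7z and .rar can be): to reach a given item the decoder has to decode the data stored before it in the same block, so requesting items one at a time repeats that work. The toy cost model below only counts bytes under that assumption. It is not a measurement and does not call the library; `SolidCostDemo` and its methods are hypothetical names.

```java
class SolidCostDemo {

    /**
     * Bytes decoded when each item is requested separately and every request
     * has to start again from the beginning of the solid block.
     */
    static long decodedPerItem(long[] sizes) {
        long total = 0;
        long prefix = 0;
        for (long size : sizes) {
            prefix += size;   // data up to and including this item
            total += prefix;  // all of it is decoded again for this request
        }
        return total;
    }

    /** Bytes decoded when the whole block is extracted in a single pass. */
    static long decodedSinglePass(long[] sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 200, 300, 400};
        System.out.println(decodedPerItem(sizes));    // 2000
        System.out.println(decodedSinglePass(sizes)); // 1000
    }
}
```

In this model the per-item cost grows roughly with the square of the number of items, while a single pass grows linearly. Formats that compress each file independently, such as .zip, would not show this effect.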