dragon66 / icafe

Java library for reading, writing, converting and manipulating images and metadata
Eclipse Public License 1.0

Performance problems with usage of icafe #26

Closed dragon66 closed 8 years ago

dragon66 commented 8 years ago

Original question from Prashant Bandewar

Jan 22 at 4:18 AM

I am using icafe 1.1 in one of my projects. The requirement is to split a multi-page TIFF file into single-page or two-page TIFF files, depending on a criterion. The splitimages method in TiffTweaker was not useful because it only splits into single pages, so I modified TiffTweaker with a custom implementation. I created the method below to call the TiffTweaker methods.

This works fine for smaller files, but as the source TIFF file grows, performance degrades considerably. Based on my analysis so far, FileCacheRandomAccessInputStream takes most of the processing time: each read starts from the beginning of the source file, so a new cache file is created from the start of the source every time.

For example, if I have a 10-page source file that needs to be split into 5 files of two pages each, then during iteration 1 the cache file is created with the first two pages (based on the list of IFDs, which contains only the one or two pages to be copied). The next time the cache is created with 4 pages, and the time after that with 6 pages. Because of this, when the input file is large, processing slows down considerably once the number of records crosses a certain threshold that depends on the system configuration.
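To make the growth concrete, here is a rough sketch in plain Java (not icafe code) that counts how many pages get cached in total when a fresh cache is built from the start of the source for every split, compared with building the cache once. The class and method names are made up for illustration. It assumes each cache has to cover everything from the first page up to the last page being copied, as described above.

```java
public class CacheCost {
    // Total pages cached when a new cache is built from the start of the
    // source for every split (assumption: the cache covers pages 1..end).
    static long pagesCachedWithReopen(int totalPages, int pagesPerSplit) {
        long total = 0;
        for (int end = pagesPerSplit; end <= totalPages; end += pagesPerSplit) {
            total += end; // this split re-reads everything up to its last page
        }
        return total;
    }

    // Total pages cached when the input stream is opened once and reused.
    static long pagesCachedWithSingleOpen(int totalPages) {
        return totalPages;
    }

    public static void main(String[] args) {
        // 10 pages, 2 per split: 2 + 4 + 6 + 8 + 10 = 30 pages cached
        System.out.println(pagesCachedWithReopen(10, 2));
        // Same file with a single shared input stream: 10 pages cached
        System.out.println(pagesCachedWithSingleOpen(10));
    }
}
```

Under that assumption the re-open cost grows roughly with the square of the page count, which would match the slowdown seen on large files.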

Any help from your side would be greatly appreciated.

public static void createCMoDTiff(FileInputStream fin, FileOutputStream fout, List<IFD> list) throws IOException {
        long startTiffSplit=System.currentTimeMillis();
        RandomAccessInputStream rin = new FileCacheRandomAccessInputStream(fin,102400);
        RandomAccessOutputStream rout = new FileCacheRandomAccessOutputStream(fout,102400);
        // Copy the header information as is
        copyHeader(rin, rout);
        // copy the pages described in IFDs from input file to output file
        copyPages(list, TIFFWriter.FIRST_WRITE_OFFSET, rin, rout);
        int firstIFDOffset = list.get(0).getStartOffset();
        // correct the pointer to 1st page
        writeToStream(rout, firstIFDOffset);
        rin.close();
        rout.close();
        long endTiffSplit = System.currentTimeMillis();
        LOGGER.info("Time taken in ViaTiffTweaker : "+(endTiffSplit-startTiffSplit));
    }

dragon66 commented 8 years ago

You only need to create the RandomAccessInputStream once and read in the list of IFDs. Keep it open until you have finished all the splits. You do need to create a new RandomAccessOutputStream for each split file, with either one or two pages.

This is the revised version I tried and it works.

public static void createCMoDTiff(FileInputStream fin, int pagesPerSplit) throws IOException {
        long startTiffSplit=System.currentTimeMillis();
        RandomAccessInputStream rin = new FileCacheRandomAccessInputStream(fin,102400);
        List<IFD> list = new ArrayList<IFD>();
        short endian = rin.readShort();
        rin.seek(STREAM_HEAD);
        int offset = readHeader(rin);
        readIFDs(null, null, TiffTag.class, list, offset, rin);
        int numberOfFiles = list.size()/pagesPerSplit;
        FileOutputStream fout = null;
        RandomAccessOutputStream rout = null;
        int count = 0;
        // i is a page index, so the bound is the number of pages covered by full splits
        for(int i = 0; i < numberOfFiles*pagesPerSplit; i+=pagesPerSplit) {
            fout = new FileOutputStream("split_"+count++ + ".tif");
            rout = new FileCacheRandomAccessOutputStream(fout,102400);
            writeHeader(endian, rout);
            // copy the pages described in IFDs from input file to output file
            copyPages(list.subList(i, i + pagesPerSplit), TIFFWriter.FIRST_WRITE_OFFSET, rin, rout);
            int firstIFDOffset = list.get(i).getStartOffset();
            // correct the pointer to 1st page
            writeToStream(rout, firstIFDOffset);          
            rout.close();
        }
        // Pages may be left over when the page count is not a multiple of pagesPerSplit
        int leftOver = list.size()%pagesPerSplit;
        if(leftOver > 0) ;// Remember to output the leftover page(s) as a final TIFF
        rin.close();
        long endTiffSplit = System.currentTimeMillis();
        LOGGER.info("Time taken in ViaTiffTweaker : "+(endTiffSplit-startTiffSplit));
    }
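
For the leftover pages, one option is to compute the page ranges up front so that the last range is simply shorter. The sketch below is plain Java with no icafe calls. SplitRanges and ranges are hypothetical names used only for illustration, not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitRanges {
    // Returns {start, end} index pairs (end exclusive) covering all pages.
    // The last pair is shorter when pageCount is not a multiple of pagesPerSplit.
    static List<int[]> ranges(int pageCount, int pagesPerSplit) {
        if (pagesPerSplit <= 0) {
            throw new IllegalArgumentException("pagesPerSplit must be positive");
        }
        List<int[]> result = new ArrayList<>();
        for (int start = 0; start < pageCount; start += pagesPerSplit) {
            int end = Math.min(start + pagesPerSplit, pageCount);
            result.add(new int[] {start, end});
        }
        return result;
    }

    public static void main(String[] args) {
        // 5 pages, 2 per split: prints 0..2, 2..4, 4..5
        for (int[] r : ranges(5, 2)) {
            System.out.println(r[0] + ".." + r[1]);
        }
    }
}
```

Each pair could then be passed to list.subList(start, end) in place of the fixed-size sublist, so the final TIFF picks up whatever pages remain.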