Description of the issue:
Converting a SAM file to a BAM file was slower than expected. Looking at the profiler, I noticed the following:
The SAMReader spends about 20% of its time in the File.length() method. While this call is basically "free" on a local filesystem, it is not on a network drive, where each call can mean a round trip to the server.
This could easily be fixed by caching the size of the file in the constructor of htsjdk.samtools.seekablestream.SeekableFileStream.
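A minimal sketch of the caching idea, not the actual htsjdk code: the class name CachedLengthFileStream and its methods are hypothetical and only illustrate reading the length once in the constructor. It assumes the file does not change size while the stream is open.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical stand-in for a seekable file stream; not the htsjdk class. */
class CachedLengthFileStream implements AutoCloseable {
    private final RandomAccessFile raf;
    // Queried once at construction; assumes the file is not modified while open.
    private final long length;

    CachedLengthFileStream(File file) throws IOException {
        this.raf = new RandomAccessFile(file, "r");
        this.length = raf.length();
    }

    /** Returns the cached size without touching the file system again. */
    long length() {
        return length;
    }

    /** End-of-file check that compares against the cached length. */
    boolean eof() throws IOException {
        return raf.getFilePointer() >= length;
    }

    void seek(long position) throws IOException {
        raf.seek(position);
    }

    int read(byte[] buffer, int offset, int count) throws IOException {
        return raf.read(buffer, offset, count);
    }

    @Override
    public void close() throws IOException {
        raf.close();
    }
}
```

With this shape, repeated length() or eof() checks during parsing cost a field read instead of a file system query.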
Your environment:
version of htsjdk: 3.0.2
version of java: 19
which OS: Windows 10
Steps to reproduce
Put a SAM file on a network drive (in my case a Synology NAS with an SMB connection).
Read the file and profile it.
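To check the cost of the call in isolation, without a profiler, something like the following rough timing sketch could be run against a file on the network share and against a local copy. The class FileLengthTiming and its method are hypothetical helpers written for this report, and the numbers are only indicative since this is not a proper benchmark.

```java
import java.io.File;

/** Hypothetical helper: rough timing of repeated File.length() calls. */
class FileLengthTiming {

    /** Returns the average time in nanoseconds of one File.length() call. */
    static double averageLengthCallNanos(File file, int calls) {
        long sink = 0; // accumulate results so the loop cannot be optimized away
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += file.length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unexpected negative length sum");
        }
        return (double) elapsed / calls;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java FileLengthTiming <path-to-file>");
            return;
        }
        File file = new File(args[0]);
        double nanos = averageLengthCallNanos(file, 10_000);
        System.out.printf("File.length() on %s: about %.0f ns per call%n", file, nanos);
    }
}
```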
Expected behaviour
The code should not spend 20% of the time getting the length of the file.
Actual behaviour
The code constantly asks the remote file system how big the file is.