JetBrains-Research / big

BigWIG, BigBED and TDF for the JVM
MIT License

big

big implements high-performance classes for reading and writing BigWIG, BigBED and TDF files. You can use big from any programming language running on the JVM, but parts of the public API are Kotlin-specific.

Installation

The latest version of big is available on Maven Central. If you're using Gradle, add the following to your build.gradle:

repositories {
    mavenCentral()
}

dependencies {
    // The 'compile' configuration was removed in Gradle 7; use 'implementation'.
    implementation 'org.jetbrains.bio:big:0.9.1'
}

With Maven, specify the following in your pom.xml:

<dependency>
  <groupId>org.jetbrains.bio</groupId>
  <artifactId>big</artifactId>
  <version>0.9.1</version>
</dependency>

Previous versions were published on Bintray and can be downloaded from GitHub Releases.

Examples

The following examples assume that all required symbols are imported into the current scope. They also rely on the following helper function for reading TSV-formatted chromosome sizes from UCSC annotations.

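import java.nio.file.Files
import java.nio.file.Path

/** Fetches chromosome sizes from a UCSC-provided TSV file. */
internal fun Path.chromosomes(): List<Pair<String, Int>> {
    // 'use' closes the reader once all lines have been consumed.
    return Files.newBufferedReader(this).use { reader ->
        reader.lineSequence().map { line ->
            // Only the first two columns (name, size) are needed.
            val chunks = line.split('\t', limit = 3)
            chunks[0] to chunks[1].toInt()
        }.toList()
    }
}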
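
To make the expected input concrete, here is a minimal sketch of the helper applied to a small chromosome sizes file. The file contents and the temporary file name are made up for illustration. The helper is repeated (without the internal modifier) only so that the snippet compiles on its own. Each line holds a chromosome name and its size separated by a tab, and any further columns are ignored.

```kotlin
import java.nio.file.Files
import java.nio.file.Path

// Same logic as the helper above, repeated so this snippet is self-contained.
fun Path.chromosomes(): List<Pair<String, Int>> {
    return Files.newBufferedReader(this).use { reader ->
        reader.lineSequence().map { line ->
            val chunks = line.split('\t', limit = 3)
            chunks[0] to chunks[1].toInt()
        }.toList()
    }
}

fun main() {
    // Hypothetical input: the names and sizes are illustrative, not real annotations.
    val path = Files.createTempFile("example", ".chrom.sizes")
    try {
        Files.write(path, listOf("chr1\t1000", "chr2\t500\tignored"))
        println(path.chromosomes())  // prints [(chr1, 1000), (chr2, 500)]
    } finally {
        Files.deleteIfExists(path)
    }
}
```

Note that the helper does not skip blank or malformed lines: a line without a tab, or with a non-numeric size, causes an exception.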

wigToBigWig

fun wigToBigWig(inputPath: Path, outputPath: Path, chromSizesPath: Path) {
    BigWigFile.write(WigFile(inputPath), chromSizesPath.chromosomes(), outputPath)
}

bigWigSummary

fun bigWigSummary(inputPath: Path, numBins: Int) {
    BigWigFile.read(inputPath).use { bwf ->
        println("Total: ${bwf.totalSummary}")

        for (chromosome in bwf.chromosomes.valueCollection()) {
            for ((i, summary) in bwf.summarize(chromosome, numBins = numBins).withIndex()) {
                println("bin #${i + 1}: $summary")
            }
        }
    }
}

bedToBigBed

fun bedToBigBed(inputPath: Path, outputPath: Path, chromSizesPath: Path) {
    BigBedFile.write(BedFile(inputPath), chromSizesPath.chromosomes(), outputPath)
}

bigBedToBed

fun bigBedToBed(inputPath: Path) {
    BigBedFile.read(inputPath).use { bbf ->
        for (chromosome in bbf.chromosomes.valueCollection()) {
            for ((chrom, start, end) in bbf.query(chromosome)) {
                // See 'BedEntry' for a complete list of available
                // attributes.
                println("$chrom\t$start\t$end")
            }
        }
    }
}

Building from source

The build process is as simple as:

$ ./gradlew jar

Note: don't use ./gradlew assemble, since it includes the signing of the artifacts and will fail if the correct credentials are not provided.

Testing

No extra configuration is required for running the tests from Gradle:

$ ./gradlew test

Publishing

You can publish a new release with a one-liner:

$ ./gradlew clean assemble test generatePomFileForMavenJavaPublication bintrayUpload

Make sure to set Bintray credentials (see API key section here) in $HOME/.gradle/gradle.properties.

$ cat $HOME/.gradle/gradle.properties
bintrayUser=CHANGEME
bintrayKey=CHANGEME

Useful links