gbif / dwca-io

Darwin Core Archive IO
Apache License 2.0

Darwin Core Archive I/O (dwca-io)

Formerly known as dwca-reader

The dwca-io library provides:

To build the project

Note: this project requires Java 8.

mvn clean install

Usage

Reading a simple Darwin Core Archive

Read an archive and display data from the core record:

Path myArchiveFile = Paths.get("myArchive.zip");
Path extractToFolder = Paths.get("/tmp/myarchive");
Archive dwcArchive = DwcFiles.fromCompressed(myArchiveFile, extractToFolder);

// Loop over core records and display id, genus, specific epithet
for (Record rec : dwcArchive.getCore()) {
  System.out.printf("%s: %s %s%n", rec.id(), rec.value(DwcTerm.genus), rec.value(DwcTerm.specificEpithet));
}
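For orientation, the sketch below shows roughly what the two path arguments above stand for: a zip file and a folder to unpack it into. It uses only the JDK and is not the dwca-io implementation; the class name ArchiveUnzipSketch and its unzip helper are hypothetical, and it assumes the archive is an ordinary zip (a Darwin Core Archive typically holds a meta.xml descriptor plus delimited data files).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of dwca-io: unpacks a zip into a folder.
public class ArchiveUnzipSketch {

  /** Extracts every file entry of zipFile into targetDir and returns the entry names. */
  static List<String> unzip(Path zipFile, Path targetDir) throws IOException {
    List<String> names = new ArrayList<>();
    Path root = targetDir.toAbsolutePath().normalize();
    Files.createDirectories(root);
    try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zipFile))) {
      ZipEntry entry;
      while ((entry = in.getNextEntry()) != null) {
        Path out = root.resolve(entry.getName()).normalize();
        // Refuse entries that would be written outside the target folder.
        if (!out.startsWith(root)) {
          throw new IOException("Entry outside target folder: " + entry.getName());
        }
        if (entry.isDirectory()) {
          Files.createDirectories(out);
        } else {
          Files.createDirectories(out.getParent());
          // Copies only the current entry; the zip stream stays open for the next one.
          Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
          names.add(entry.getName());
        }
      }
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    Path zip = Paths.get(args.length > 0 ? args[0] : "myArchive.zip");
    Path dir = Paths.get(args.length > 1 ? args[1] : "/tmp/myarchive");
    for (String name : unzip(zip, dir)) {
      System.out.println(name);
    }
  }
}
```

Listing the extracted names this way can be a quick check that the zip really contains the files you expect before handing it to the library.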

Reading a Darwin Core Archive with extensions

Read a compressed archive (extracted to a folder) and display data from the core and its extensions:

Path myArchiveFile = Paths.get("myArchive.zip");
Path extractToFolder = Paths.get("/tmp/myarchive");
Archive dwcArchive = DwcFiles.fromCompressed(myArchiveFile, extractToFolder);

System.out.println("Archive rowtype: " + dwcArchive.getCore().getRowType() + ", "
    + dwcArchive.getExtensions().size() + " extension(s)");

// Loop over star records and display id, core record data, and extension data
for (StarRecord rec : dwcArchive) {
  System.out.printf("%s: %s %s%n", rec.core().id(), rec.core().value(DwcTerm.genus), rec.core().value(DwcTerm.specificEpithet));
  if (rec.hasExtension(DwcTerm.Occurrence)) {
    for (Record extRec : rec.extension(DwcTerm.Occurrence)) {
      System.out.println(" - " + extRec.value(DwcTerm.country));
    }
  }
}
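The loop above walks a "star": each core record together with the extension rows that point back at it. The sketch below illustrates that join with plain JDK collections. It is a conceptual illustration only, not how dwca-io implements StarRecord; the class StarJoinSketch, its groupByCoreId helper, and the sample rows are all invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the star layout: core rows joined to extension rows by id.
public class StarJoinSketch {

  /** Groups extension rows of the form {coreId, value} by their core id, keeping row order. */
  static Map<String, List<String>> groupByCoreId(List<String[]> extensionRows) {
    Map<String, List<String>> byId = new LinkedHashMap<>();
    for (String[] row : extensionRows) {
      byId.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
    }
    return byId;
  }

  public static void main(String[] args) {
    // Invented sample data: core rows are {id, genus, specificEpithet}.
    List<String[]> core = Arrays.asList(
        new String[] {"1", "Abies", "alba"},
        new String[] {"2", "Picea", "abies"});
    // Invented sample data: extension rows are {coreId, country}.
    List<String[]> extension = Arrays.asList(
        new String[] {"1", "Denmark"},
        new String[] {"1", "Norway"});

    Map<String, List<String>> byId = groupByCoreId(extension);
    for (String[] rec : core) {
      System.out.printf("%s: %s %s%n", rec[0], rec[1], rec[2]);
      // A core record may have no extension rows at all, hence the empty default.
      for (String country : byId.getOrDefault(rec[0], Collections.<String>emptyList())) {
        System.out.println(" - " + country);
      }
    }
  }
}
```

The empty default plays the same role as the hasExtension check in the loop above: a core record with no matching extension rows simply prints nothing extra.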

Other supported file types

The DwcFiles.fromLocation method also supports the following file types:

Notes

Maven

Ensure the GBIF repository is declared in your pom.xml:

<repositories>
  <repository>
    <id>gbif-repository</id>
    <url>https://repository.gbif.org/content/groups/gbif</url>
  </repository>
</repositories>

Add the dwca-io artifact as a dependency:

  <dependency>
    <groupId>org.gbif</groupId>
    <artifactId>dwca-io</artifactId>
    <version>{latest-version}</version>
  </dependency>

where {latest-version} is the most recent released version of dwca-io.

Change Log

Change Log

Documentation

JavaDocs

Unsupported archives

Darwin Core Text specifies several features which are not supported by this library.

These features are very rarely used, and will not be implemented without good reason.