broadinstitute / barclay

Command line argument parser and online documentation generation utilities for java command line programs.
BSD 3-Clause "New" or "Revised" License
9 stars 6 forks source link

Add support for java.io.Path #106

Open pshapiro4broad opened 7 years ago

pshapiro4broad commented 7 years ago

I'd like to use Path instead of File in new code (and migrate existing code) since we'd like to use the newer java.io.Files APIs in our code. Since Path is abstract it doesn't have a a constructor Path(String), I can't use it directly with @Argument.

It doesn't look like there's an extension point in the parser for a client of the API to add support for new types.

It would be nice if the parser supported Path directly, or if that's not feasible, support an extension API so that Path typed arguments could be parsed by a client of the API.

cmnbroad commented 7 years ago

@pshapiro4broad I sympathize with the problem, but Barclay needs to be able to instantiate and populate (non collection) argument objects. As you point out, Path is an interface with multiple implementations, and needs to be resolved against a file system, which is something the caller needs to do.

I'm open to suggestions, but any solution that requires Barclay to delegate file system resolution and instantiation responsibility back to the client seems more complicated than just requiring that the client use any argument type of it's own that has a string constructor.

Having said that, the Picard(s) and GATK all have this same problem...

vruano commented 7 years ago

Wouldn't

FileSystems.getFileSystem(URI.create(str)).provider().getPath(URI.create(str))

always return the right answer? Could these special cases be hard-wired for pivotal classes/interfaces like Path?

Btw the code snippet above might be overcomplicated .... I just came up with it doing some quick API browsing.

cmnbroad commented 7 years ago

@vruano In theory I think you should be able to do that, or something like it. But see what GATK currently has to do to resolve the cases we care about. At some point, we plan to use a custom, URI-like class in GATK for these arguments, but Picard is going to need something similar. @lbergelson Any thoughts on this ?

magicDGS commented 7 years ago

An idea may be to create an interface-based class to load custom objects from a String passed in the command line, with a simple implementation. A custom class can be passed to the argument parser to support objects without a String constructor. The default value would try to use a default constructor with Stringfor every class, and fail if there is none.

Custom implementations can override the method to provide custom loading for classes they care about. E.g., GATK can provide an override for java.nio.Path which uses IOUtils, and Picard can use Paths.get instead.

This can also provide a way to parse genome locations from the String directly in the command line without a String constructor, for example.

pshapiro4broad commented 7 years ago

I wrote this ticket under the assumption that Paths.get(new URI(s)) would work correctly with gcs and local file paths. ( @vruano this is the simpler form you were looking for)

Did we run into some issue where this usage wouldn't work? If so, perhaps this was a bug in the google SDK that's been fixed since then.

droazen commented 7 years ago

@pshapiro4broad If your goal is to make use of the NIO support in htsjdk for GCS, etc., then you should know that most of the existing APIs require you to pass in the URI as a String anyway (in particular, the Tribble API), and htsjdk will internally construct a Path object as needed. This is because not all protocols are currently supported via NIO -- http, for example, is handled outside of NIO currently.

pshapiro4broad commented 7 years ago

My goal is to migrate all code to use Path over File in general, not specific to our use of Path for GCS access. The direct motivation was a recent change that added another CLP to picard-private which uses File for its command line arguments only to convert it to Path when it's used.

vruano commented 7 years ago

What are the plans regarding adopting NIO fully?

Is there a open implementation of a "HttpFileSystemProvider" that we could use? Is strange that one would not be available by now even if it is read-only being such a common protocol.

lbergelson commented 7 years ago

We didn't find an HTTP provider when we last looked. Maybe there is one now. We could write one but it's a pretty low priority and no one really has time to do it.