Open pshapiro4broad opened 7 years ago
@pshapiro4broad I sympathize with the problem, but Barclay needs to be able to instantiate and populate (non collection) argument objects. As you point out, Path is an interface with multiple implementations, and needs to be resolved against a file system, which is something the caller needs to do.
I'm open to suggestions, but any solution that requires Barclay to delegate file system resolution and instantiation responsibility back to the client seems more complicated than just requiring that the client use any argument type of it's own that has a string constructor.
Having said that, the Picard(s) and GATK all have this same problem...
Wouldn't
FileSystems.getFileSystem(URI.create(str)).provider().getPath(URI.create(str))
always return the right answer? Could these special cases be hard-wired for pivotal classes/interfaces like Path?
Btw the code snippet above might be overcomplicated .... I just came up with it doing some quick API browsing.
@vruano In theory I think you should be able to do that, or something like it. But see what GATK currently has to do to resolve the cases we care about. At some point, we plan to use a custom, URI-like class in GATK for these arguments, but Picard is going to need something similar. @lbergelson Any thoughts on this ?
An idea may be to create an interface-based class to load custom objects from a String
passed in the command line, with a simple implementation. A custom class can be passed to the argument parser to support objects without a String
constructor. The default value would try to use a default constructor with String
for every class, and fail if there is none.
Custom implementations can override the method to provide custom loading for classes they care about. E.g., GATK can provide an override for java.nio.Path
which uses IOUtils
, and Picard can use Paths.get
instead.
This can also provide a way to parse genome locations from the String
directly in the command line without a String
constructor, for example.
I wrote this ticket under the assumption that Paths.get(new URI(s))
would work correctly with gcs and local file paths. ( @vruano this is the simpler form you were looking for)
Did we run into some issue where this usage wouldn't work? If so, perhaps this was a bug in the google SDK that's been fixed since then.
@pshapiro4broad If your goal is to make use of the NIO support in htsjdk for GCS, etc., then you should know that most of the existing APIs require you to pass in the URI as a String anyway (in particular, the Tribble API), and htsjdk will internally construct a Path object as needed. This is because not all protocols are currently supported via NIO -- http, for example, is handled outside of NIO currently.
My goal is to migrate all code to use Path
over File
in general, not specific to our use of Path
for GCS access. The direct motivation was a recent change that added another CLP to picard-private which uses File
for its command line arguments only to convert it to Path
when it's used.
What are the plans regarding adopting NIO fully?
Is there a open implementation of a "HttpFileSystemProvider" that we could use? Is strange that one would not be available by now even if it is read-only being such a common protocol.
We didn't find an HTTP provider when we last looked. Maybe there is one now. We could write one but it's a pretty low priority and no one really has time to do it.
I'd like to use
Path
instead ofFile
in new code (and migrate existing code) since we'd like to use the newerjava.io.Files
APIs in our code. SincePath
is abstract it doesn't have a a constructorPath(String)
, I can't use it directly with@Argument
.It doesn't look like there's an extension point in the parser for a client of the API to add support for new types.
It would be nice if the parser supported
Path
directly, or if that's not feasible, support an extension API so thatPath
typed arguments could be parsed by a client of the API.