Open missinglink opened 3 years ago
Quickly, because I need to run away and attend to some other things.
There is currently no stdin
reader that implements the go-reader interface. That would be useful for have for things like this.
Of the three examples, the second is nearly there. Specifically, it should be:
./bin/exportify -reader-uri fs:// /path/to/berlin.geojson
Reader URIs are basically just URI-based constructors with relevant instructions for how to read one or more paths defined separately from the constructor itself. This can be irritating at times but, in the end, is less irritating than all the issues raised by trying to pack multiple paths in to the constructor itself.
But! The example above won't work today. It should work (as in: it should be made to work) but it doesn't because the tool was originally written to mirror the CLI signature for the Python tool which assumes one or more WOF IDs rather than arbitrary GeoJSON features:
This code was probably written in between the time that the Python code was refactored to speak Python3 and was taught to assume multiple input sources, so the same rules should be applied here too.
The relevant bit being the expectation in main
of a list of IDs and calls to the wof_reader.LoadBytesFromID
method.
The actual exporting (and formatting) and writing part just assumes bytes and user-defined go-writer.Writer
instance:
FYI:
I considered wiring this (reading from STDIN) in to the file://
package and special-casing the -
path (to mean "STDIN") but decided to just be explicit about things.
Cool thanks!
I feel like stdin is just a file living at /dev/stdin
(optionally aliased as -
), so it should be accessed via the fs://
protocol/scheme rather than defining its own protocol.
This method reminds me of /vsistdin/
from ogr2ogr
and solves my problem anyway, thanks!
Can os.Open
actually open /dev/stdin
? I've never tried...
Yea you totally can, but you don't really need to open it since it's already opened for you:
These two methods are equivalent:
func stdin() {
bytes, err := ioutil.ReadAll(os.Stdin)
log.Println(err, string(bytes))
}
func open() {
f, _ := os.Open("/dev/stdin")
bytes, err := ioutil.ReadAll(f)
log.Println(err, string(bytes))
}
I usually do something like this which returns an *os.File
regardless of which method is used:
// OpenFile - open a file
func OpenFile(name string, shouldCreate bool) *os.File {
// stdin
if name == "-" || name == "/dev/stdin" {
return os.Stdin
}
if shouldCreate {
if _, err := os.Stat(name); os.IsNotExist(err) {
os.Create(name)
}
}
// var file, _ = os.Open(name)
file, err := os.OpenFile(name, os.O_RDWR, 0644)
if err != nil {
panic(err)
}
return file
}
Presumably the point of using os.Stdin
however is that it handles cross-platform issues for you.
It looks like that's handled at lower levels since os.Stdin
refers to /dev/stdin
.
https://cs.opensource.google/go/go/+/refs/tags/go1.17:src/os/file.go;l=65
I had a quick poke around in the windows adapter but it wasn't immediately clear how it handles this: https://cs.opensource.google/go/go/+/refs/tags/go1.17:src/os/file_windows.go
Agh so it looks like the Windows syscall
adapter handles these unix-isms so they "just work"
https://cs.opensource.google/go/go/+/master:src/syscall/syscall_windows.go;l=474?q=Stdin&ss=go%2Fgo:src%2Fsyscall%2F
I would favour just being explicit and boring about this and keeping STDIN-related stuff inside it's own package.
I suppose we could update go-reader/file.go
constructor to check for and honour an ?allow-stdin
query parameter and if present do something like:
if r.allow_stdin && uri == "/dev/stdin" {
sr, _ := reader.NewReader(ctx, "stdin://")
return rs.Read(ctx, "-")
}
And/or just create and store sr
in the constuctor.
Or just this:
https://github.com/whosonfirst/go-reader/compare/allow-stdin
Which would allow something like this:
$> cat README.md | ./bin/read -reader-uri file:///dev/stdin - | wc -l
348
Again, note the "-". The paths that a reader will consume are a separate issue from the reader's own constructor.
I don't really like this because it means we're in the business of guessing, and of hijacking, other people's definition of stdin
in filenames.
As of v0.0.17 you can export records that are read from STDIN
. For example:
$> cat /usr/local/data/sfomuseum-data-whosonfirst/data/102/527/513/102527513.geojson | ./bin/exportify -reader-uri stdin:// -writer-uri stdout:// | grep wof:name
"wof:name": "San Francisco International Airport",
The relevant code is:
Note however that the exportify
tool does not try to update or fix pre-2019 EDTF values so until those values are fixed (manually or in go-whosonfirst-export/properties/edtf.go) then this will not be unexpected:
$> cat /usr/local/data/whosonfirst-data-admin-us/data/102/061/079/102061079.geojson | ./bin/exportify -reader-uri stdin:// -writer-uri stdout://
2021/08/20 11:33:51 Failed to export from STDIN, Unrecognized EDTF string 'uuuu' (Invalid or unsupported EDTF string)
There is an open ticket for this:
Is it possible to
exportify
a single file?I would like to pass a single feature in, either via
stdin
or a filename and have the response written to eitherstdout
or a destination path.The unix-y
stdio
stream interface would be ideal for me because it would make it easier to includeexportify
in pipelines.I had a look at the command line options and it doesn't currently seem possible, or maybe I'm wrong?