ropensci / unconf16

rOpenSci's San Francisco hackathon/unconf 2016
http://unconf16.ropensci.org

OpenAPI Specification: generate the R client from an API spec? #18

Open jennybc opened 8 years ago

jennybc commented 8 years ago

Why are most R packages that wrap APIs such unique and special snowflakes? ❄️

For discussion

gaborcsardi commented 8 years ago

++ :100:

sckott commented 8 years ago

see also https://github.com/ropenscilabs/apipkgen

jennybc commented 8 years ago

To flesh this out a bit ...

@gaborcsardi's gh wrapper for the GitHub API is an interesting data point. It's basically the thinnest possible wrapper, and to use it you have to read the API docs. Which I personally don't mind. It feels like I usually have to do this anyway! And at least this way you don't have to learn the UI that the package author designed to put their own personal touch on the API.

So what if, when you go to wrap a new API, you spend your time documenting it according to some spec? That would be a useful contribution on its own. Then you run a code generator on it and do the bare minimum of customization. Use the time you saved to write an amazing vignette. This seems much more efficient for package development and would give users far less variety of interface to learn. It would also increase the number of API-wrapping packages that are well documented and tested, as presumably that would be part of the framework.

How could you write any custom bits so that re-running the generator, maybe to update against a new version of the API, doesn't clobber them? Or signals where boilerplate and custom collide and require reconciliation? Or maybe there's no room in this paradigm for hand-crafted bits in the client package?
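One possible answer to the clobbering question is to fence hand-written code in named marker regions that the generator copies forward on every run, and to report any region that no longer has a home. The following is only a minimal sketch of that idea, written in Python for brevity; the marker syntax (`# <custom:NAME>` ... `# </custom:NAME>`) and the function names are hypothetical and not taken from any existing generator.

```python
import re

# Hypothetical marker syntax: "# <custom:NAME>" ... "# </custom:NAME>".
CUSTOM_BLOCK = re.compile(
    r"# <custom:(?P<name>[\w-]+)>\n(?P<body>.*?)# </custom:(?P=name)>\n",
    re.DOTALL,
)


def extract_custom(text):
    """Map each region name to its hand-written body in an existing file."""
    return {m.group("name"): m.group("body") for m in CUSTOM_BLOCK.finditer(text)}


def regenerate(fresh, previous):
    """Copy custom regions from `previous` into freshly generated `fresh`.

    Returns the merged text plus the names of custom regions that have no
    slot in the new output, so a human can reconcile them by hand.
    """
    saved = extract_custom(previous)
    used = set()

    def fill(match):
        name = match.group("name")
        if name in saved:
            used.add(name)
        # Keep the generator's default body when nothing was customized.
        body = saved.get(name, match.group("body"))
        return f"# <custom:{name}>\n{body}# </custom:{name}>\n"

    merged = CUSTOM_BLOCK.sub(fill, fresh)
    orphaned = sorted(set(saved) - used)
    return merged, orphaned
```

Here `orphaned` is the "boilerplate and custom collide" signal: a non-empty list means the new API version dropped a slot that still held hand-written code. Whether marker regions are enough, or whether custom code belongs in separate files the generator never touches, is a design choice the thread leaves open.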

cboettig commented 8 years ago

Great stuff!

For SOAP-based APIs, my understanding is that this was pretty commonly done, by specifying a WSDL file with the API.

For instance, Duncan's Rflickr R package was built in just the way you describe, by parsing the WSDL automatically to make most of the functions and then fine-tuning a bit. There's a REST equivalent of WSDL, called WADL, which seems to have never really taken off (guess it smelled too much like SOAP). From my very limited perspective it seems like the world got bored with the very heavy approach. In some sense, the logical extreme of REST is antithetical to the idea of needing to write a language-specific wrapper: the REST interface was supposed to be simple and clean enough to use directly with a general-purpose library like httr. (though I think our experience shows differently for the most part, but maybe a valid question).

To me the special snowflake discussion in web APIs seems to parallel the special snowflake discussion in data formatting. For a while it seemed like we had lots of data formats being defined very precisely by schema files, which play much the same role as these standardized descriptions of the API (essentially schemas for the API itself). Then all of that seemed heavy and slow and overly complicated, so lots of stuff just went to schema-less JSON, where code development could be nice and fast and not worry about schemas and standards. There are certainly advantages to both approaches, and which is better no doubt depends on the context. (Not to sound old-crotchety because I really know nothing about any of this, but it does feel like this agile JSON/REST world is now re-inventing schemas and standards for API descriptions; though no doubt with refinements and perhaps that's how progress works).

Anyway, just saying that I think this issue raises some very interesting (at least to me) conceptual questions about simplicity and flexibility vs standards and generality.

leeper commented 8 years ago

And just a note: SSOAP, in theory, develops S4-based clients from a WSDL document.

sckott commented 8 years ago

@leeper right, but SSOAP isn't working with R >= 3.2 - does it work for you in that version or greater?

one promising route is http://jsonapi.org/ && started fiddling here https://github.com/sckott/rjsonapi - but I think there's little uptake of JSONAPI so far

cboettig commented 8 years ago

@leeper well not just in theory, you can see Rflickr and I think other examples on omegahat.net of packages that were made with SSOAP and worked at least at one point. But as Scott points out, maybe this still is just 'in theory' since like many omegahat projects some of this is more proof of principle than a robust and current solution. (Duncan also has a WADL package, but not sure if anyone has encountered WADL documentation for a REST API?) Even with earlier versions of R I think Scott & I have had some troubles realizing the magic of the automated approach.

I've had the same problem with XMLSchema, which is supposed to generate S4 classes automatically from schema files.

For me, the fundamental question here is: are these examples (SSOAP, XMLSchema) like other Omegahat projects that really are brilliant and important ideas that just need some work to be more robust, practical and user friendly (like, say, how RCurl and XML were fundamental but have become rather more practical in their newer incarnations...) or are there more fundamental issues at play here that will doom things like JSONAPI & other programmatic generation of wrappers to the same fate? I've no idea, maybe a bit of both, but I'd love to hear others' thoughts on that!

karthik commented 8 years ago

> the REST interface was supposed to be simple and clean enough to use directly with a general-purpose library like httr. (though I think our experience shows differently for the most part, but maybe a valid question).

Agreed - a great idea that never came to fruition. I would love it if that were the case, but all the little customizations we have to do just accumulate over time making it necessary to have so many snowflake packages.

wing328 commented 8 years ago

If anyone would like to contribute an R generator to Swagger-Codegen, here is a good starting point: https://github.com/swagger-api/swagger-codegen/issues/2231

For those who are not familiar with Java, another approach is to implement an R API client that accesses just one endpoint: https://github.com/swagger-api/swagger-codegen/blob/master/modules/swagger-codegen/src/test/resources/2_0/petstore.json#L406, and then the Swagger Codegen core team can reverse-engineer the generator and template for R.

If you've any questions about Swagger Codegen, please let me know.

Disclosure: I'm a top contributor to Swagger Codegen.

hadley commented 8 years ago

Improving existing API guidelines at https://usecanvas.com/anonymous/best-practices-for-writing-an-api-package/2YxE9uDvd4OgDzvjMutlHI

leeper commented 8 years ago

@hadley There's a reasonably coherent/complete update to the document on canvas. Probably needs some editing.

hadley commented 8 years ago

@leeper I filed an issue at https://github.com/hadley/httr/issues/350 to remind me to integrate (about an hour ago)

bergant commented 8 years ago

I needed a quick and dirty solution for calling APIs that change frequently but are specified in OpenAPI (Swagger) format. Dynamically creating a list of functions that use httr and jsonlite seemed like a good idea. See https://github.com/bergant/rapiclient.
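The general idea of turning a spec into a list of callable functions at run time can be sketched as follows. This is not rapiclient's implementation; it is a minimal illustration written in Python for brevity, using a tiny made-up Swagger 2.0 style document. The host, paths, and operation names are all assumptions, and the generated functions only build a request description rather than sending it. It also assumes every key under a path is an HTTP method, which real specs do not guarantee.

```python
from urllib.parse import quote, urlencode

# A tiny, made-up Swagger 2.0 style document; every name is illustrative.
SPEC = {
    "host": "api.example.org",
    "basePath": "/v1",
    "schemes": ["https"],
    "paths": {
        "/pet/{petId}": {
            "get": {
                "operationId": "getPetById",
                "parameters": [{"name": "petId", "in": "path", "required": True}],
            }
        },
        "/pet/findByStatus": {
            "get": {
                "operationId": "findPetsByStatus",
                "parameters": [{"name": "status", "in": "query", "required": True}],
            }
        },
    },
}


def make_operation(base_url, path, method, op):
    """Return a function that builds (but does not send) one request."""
    params = op.get("parameters", [])

    def call(**kwargs):
        missing = [p["name"] for p in params
                   if p.get("required") and p["name"] not in kwargs]
        if missing:
            raise TypeError(f"missing required parameter(s): {missing}")
        url = base_url + path
        query = {}
        for p in params:
            if p["name"] not in kwargs:
                continue
            value = kwargs[p["name"]]
            if p["in"] == "path":
                # Substitute the {name} placeholder with the encoded value.
                url = url.replace("{" + p["name"] + "}", quote(str(value), safe=""))
            elif p["in"] == "query":
                query[p["name"]] = value
        if query:
            url += "?" + urlencode(query)
        return {"method": method.upper(), "url": url}

    call.__name__ = op["operationId"]
    return call


def build_client(spec):
    """Create one request-building function per operationId in the spec."""
    base_url = f"{spec['schemes'][0]}://{spec['host']}{spec.get('basePath', '')}"
    return {
        op["operationId"]: make_operation(base_url, path, method, op)
        for path, methods in spec["paths"].items()
        for method, op in methods.items()
    }


client = build_client(SPEC)
print(client["getPetById"](petId=42))
```

Because the functions are rebuilt from the spec each time, a change in the API description shows up in the client without regenerating or editing any source files, which suits APIs that change frequently.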

jennybc commented 8 years ago

@leeper just pointed this out too

https://github.com/hrbrmstr/swagger

hrbrmstr commented 8 years ago

That pkg needs work :-) I started it as a favour for a cyber friend, but I surveyed the landscape of sites offering Swagger/OpenAPI specs and it was a pretty bleak picture. We're about to do something at work-work API-wise that may give me intraday official hours to kickstart this back up, though. Lemme poke at the latest OpenAPI spec. I was also contemplating writing a Paw plugin for R code generation.