Closed jgeewax closed 8 years ago
/cc @mpmcdonald
Maybe we can export stuff via http://sphinx-doc.org/builders.html#sphinx.builders.html.JSONHTMLBuilder ?
I'm open to it but don't want to write our own custom docs builder. I think the readthedocs.org build script creates the JSON for something (not sure what it's used for).
I think you have to define "docs builder". My suggestion is:
I was just making a general statement that we shouldn't put too much work in, since the payoff wouldn't be that large (and since we already have docs).
I think I hear what you're saying -- I'm trying to say "the branded site matters a lot", so unless we're talking about a 6-month, 2-person full-time project... I think we should definitely explore.
I think the effort to move gcloud-nodes site to gcloud-common would be pretty minimal, imo the biggest part would be coming up with a common format for the JSON.
That would be awesome! :+1:
UPDATE: From the latest build, it looks like readthedocs.org builds JSON via
sphinx-build -T -b json -d _build/doctrees-json -D language=en . _build/json
I gave that a try, looks like it spits out some HTML formatted stuff, not a structured representation of the parse tree.
Adding a custom builder/writer seems to spit out something more useful: https://gist.github.com/jgeewax/c254f8b4d9f48162eaad
Now we just need to massage it to either look like the gcloud-node JSON (https://github.com/GoogleCloudPlatform/gcloud-node/blob/gh-pages/json/master/datastore/query.json) or massage them both into a common shared format....
@stephenplusplus , thoughts?
Not sure if this will make anyone happy, but Sphinx has JS support:
http://ericholscher.com/blog/2014/feb/11/sphinx-isnt-just-for-python/
http://sphinx-doc.org/domains.html#the-javascript-domain
I think finding a common format would be nice. As it is now, the JSON we get has a lot of extra junk in it we don't need. We'll just need to decide what those items are, then post-process our JSON build process.
The node docs site is very closely linked to our library's directory structure, which is also our class hierarchy...
Module -> Sub-module (ex: Storage -> File, Datastore -> Query)
The site's routing is hooked to this tree, as that's also how the JSON is stored: https://github.com/GoogleCloudPlatform/gcloud-node/tree/gh-pages/json/v0.10.0
I'm not sure it's possible to have all languages spit out files by the same structure (docs/json/module/submodule.json) -- if it is, that's a relief. If not, we'll have to figure that out.
One extra thing to note is that in Python for example, we have docs that don't correspond directly to a module (ie, https://googlecloudplatform.github.io/gcloud-python/stable/pubsub-usage.html). Do we have anything comparable on gcloud-node?
I don't think there's a huge issue with the module -> submodule/class structure. We already cram this in a bit with Python based on the left-nav, ie:
/cc @blowmage @quartzmo -- can you guys chime in about Ruby at all ?
@stephenplusplus : Is there a way we can start stripping down the dox output (maybe pipe it through a script) and come up with the minimal JSON that still works with the angular app?
We can get the ruby code to output JSON, but it will take some work. A Sphinx domain exists for Ruby, but it most likely also needs additional work.
We've put a good deal of work into the current docs and continue to make improvements like adding method categories. My preference is to stick with what we have since we know it works. But that preference is fairly weakly held. If you want us to switch we certainly can.
I'm not familiar with anything like dox for generating intermediate JSON for API docs in Ruby. But I do think it is a great approach and would love to make it happen.
@blowmage : The issue isn't that you've not done a good job (everyone has done a pretty solid job here). The concern is that gcloud-node has been where I file doc bugs first, so they tend to make style or functionality improvements first. Then we have to carry these over to the other projects.
It'd be really nice if we could just agree on a standard format for the content, and then use a single app to present that data. I'm working on getting Sphinx to spit out JSON that looks "good enough" to test whether this is possible.
I must've come across in a way I didn't intend. I didn't mean to sound either resistant or unappreciated. I was only trying to demonstrate that we have some momentum with our current approach.
But like I said earlier, my preference is weakly held. I'm happy to move forward with whatever is decided.
OK, I have a short script that uses pdoc and parinx to generate JSON files for docs... Here's some output on the (slightly modified) gcloud.storage.client module: https://gist.github.com/jgeewax/c0041d8cbd24d21845f9
@stephenplusplus : Is this something we could possibly morph into an input to what you use for the gcloud-node docs?
I think it's close. It looks like most of the data is there, but it would be more easily adaptable if the format was tag-oriented like: https://github.com/GoogleCloudPlatform/gcloud-node/blob/4f8a574216195929db81977d6525d0e3af58e19f/json/master/storage/index.json#L40-L81
Here's what the converted example would look like:
{
"tags": [
{
"type": "param",
"name": "bucket_name",
"description": "The bucket name to create.",
"optional": true
},
{
"type": "return",
"types": [
"class:`gcloud.storage.bucket.Bucket`"
],
"description": "The newly created bucket."
},
{
"type": "example",
"string": "bucket = client.create_bucket('my-bucket')\nprint bucket\n<Bucket: my-bucket>"
}
],
"description": {
"full": "Create a new bucket. This implements \"storage.buckets.insert\". If the bucket already exists, will raise :class:`gcloud.exceptions.Conflict`."
}
}
The docs JS loops over those tags and parses them based on the logic here: https://github.com/GoogleCloudPlatform/gcloud-node/blob/dc4fed8ed2485f28bd5ad91c8ac81ab810bb8e0c/docs/site/components/docs/docs.js#L152-L188
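To make the tag loop concrete, here is a rough Python sketch of the general idea: walk the "tags" list and bucket each tag by its "type". This is not a port of the linked docs.js logic; the helper name `group_tags` is made up for illustration, and the input is a trimmed copy of the converted example above.

```python
from collections import defaultdict


def group_tags(doc):
    """Group a Dox-style "tags" list by each tag's "type" field.

    Hypothetical helper for illustration only; a missing "tags" key is
    treated as "nothing to render".
    """
    grouped = defaultdict(list)
    for tag in doc.get("tags", []):
        grouped[tag["type"]].append(tag)
    return dict(grouped)


# Trimmed copy of the converted example from the thread.
doc = {
    "tags": [
        {"type": "param", "name": "bucket_name",
         "description": "The bucket name to create.", "optional": True},
        {"type": "return", "types": ["class:`gcloud.storage.bucket.Bucket`"],
         "description": "The newly created bucket."},
        {"type": "example",
         "string": "bucket = client.create_bucket('my-bucket')"},
    ],
}

grouped = group_tags(doc)
# Sections with no tags are simply absent, so a renderer can skip them.
param_names = [tag["name"] for tag in grouped.get("param", [])]
```

A renderer built this way can treat every section as optional, since a type that never appears in the input just has no key in the result.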
We can choose to make things optional and just not render what's missing from the JSON input. For example:
var obj = {/* the demo above */};
- "lineNum/lineNumLink" (created from raw value: obj.codeStart)
- "docs" (created from raw value: obj.tags[].type = 'resource') (this is how we call out relevant, resource links like the upstream JSON api link)
There's a bit of concern with inter-doc linking. JSDocs has us format our references to our internal classes as {module:storage/bucket#getMetadata}, for example. Looking at your example, that reference looks more like class:gcloud.storage.bucket.Bucket.
When we parse the "{module:storage/bucket#getMetadata}", it unfolds nicely to a link: "/storage/bucket/?method=getMetadata".
With: "gcloud.storage.bucket.Bucket"... "/storage/bucket/Bucket"? It should probably just be "/storage/bucket", but how can we reliably parse it?
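For reference, the JSDoc-style unfolding described above could be sketched in Python roughly like this. The regular expression and the function name `jsdoc_ref_to_route` are assumptions for illustration, not the parser the node docs site actually uses.

```python
import re

# Assumed shape of a reference: {module:<path>} with an optional #<method>.
MODULE_REF = re.compile(r"\{module:(?P<path>[\w/]+)(?:#(?P<method>\w+))?\}")


def jsdoc_ref_to_route(ref):
    """Turn "{module:storage/bucket#getMetadata}" into a docs route.

    Returns None when the text is not a JSDoc module reference, which is
    exactly the case the Python-style "class:..." references fall into.
    """
    match = MODULE_REF.fullmatch(ref)
    if match is None:
        return None
    route = "/" + match.group("path")
    method = match.group("method")
    if method:
        route += "/?method=" + method
    return route


route = jsdoc_ref_to_route("{module:storage/bucket#getMetadata}")
unparsed = jsdoc_ref_to_route("class:gcloud.storage.bucket.Bucket")
```

The second call returning None is the gap being discussed: the Python reference needs its own rule before it can become a link.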
One page of our docs currently maps to a Class. Is that the same for the other languages? If so, can we add in an obj.isConstructor property so we know which method to display first on the page?
Here's what the converted example would look like:
Where in that example do we specify the name of method itself...?
"lineNum/lineNumLink" (created from raw value: obj.codeStart)
I can work on getting the line number as well.
"docs" (created from raw value: obj.tags[].type = 'resource') (this is how we call out relevant, resource links like the upstream JSON api link)
Not sure what this is... ? Can you give me more info ?
With: "gcloud.storage.bucket.Bucket"... "/storage/bucket/Bucket"? It should probably just be "/storage/bucket", but how can we reliably parse it?
In our project (AFAIK) we follow the pattern of gcloud.storage.bucket.Bucket -> gcloud/storage/bucket.py containing class Bucket. But should this really matter? Can we maybe tweak things on your end so that it's all one big file, keyed with the absolute path? Then the file structure doesn't matter...?
The path you see is definitely an absolute unique path, but Python doesn't tie the unique path to a resource to a unique path in the file system.
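If a route is still wanted on the JS side, one possible heuristic is sketched below. It assumes the PEP 8 naming convention (CapWords for classes, lowercase for modules) and a fixed "gcloud" root package; neither is enforced by Python, so treat this as a guess that works for names like the ones in this thread, not a reliable parser. The function name `python_ref_to_route` is hypothetical.

```python
def python_ref_to_route(ref, root="gcloud"):
    """Map a dotted Python path to a "/service/sub-module" style route.

    Heuristic only: relies on class names being CapWords (PEP 8), which
    the language itself does not guarantee.
    """
    parts = ref.split(".")
    # Drop the root package so "gcloud.storage" lines up with "/storage".
    if parts and parts[0] == root:
        parts = parts[1:]
    # Assumption: a trailing CapWords component is a class, not a module.
    if parts and parts[-1][:1].isupper():
        parts = parts[:-1]
    return "/" + "/".join(parts)


class_route = python_ref_to_route("gcloud.storage.bucket.Bucket")
module_route = python_ref_to_route("gcloud.storage.bucket")
```

Keying one big JSON file by the full dotted path sidesteps this guesswork entirely, which is the argument for that approach.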
One page of our docs currently maps to a Class. Is that the same for the other languages? If so, can we add in an obj.isConstructor property so we know method to display first on the page?
In Python, constructors are __init__. So I think the answer to your question is "yes, we can reliably determine whether an item is a constructor".
name of the method
I missed that part. For us, it's:
{
"ctx": {
"type": "method",
"constructor": "Storage",
"cons": "Storage",
"name": "createBucket",
"string": "Storage.prototype.createBucket()"
}
}
All we really need from that is "name".
more info ?
https://github.com/GoogleCloudPlatform/gcloud-node/blob/master/lib/compute/firewall.js#L38 - https://github.com/GoogleCloudPlatform/gcloud-node/pull/790
In our project (AFAIK) we follow the pattern of gcloud.storage.bucket.Bucket -> gcloud/storage/bucket.py containing class Bucket. But should this really matter? Can we maybe tweak things on your end so that it's all one big file, keyed with the absolute path ? Then the file structure doesn't matter...?
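We will all need to follow the routing as it exists now on gcloud-node:
- /service -> Constructor and methods for base class (e.g. Storage)
- /service/sub-class -> Constructor and methods for (e.g. Storage/File)
- /service/sub-class?method=methodName -> Single method on sub-class (e.g. Storage/File#getFiles)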
I'm sure there's a solution where we can all define some type of custom routing, but if we don't have to, then :+1:.
We can definitely filter and combine our Dox JSON into one file of the essentials. And maybe key it by the same router hierarchy?
{
"storage": {
"description": "...",
"example": "...",
"methods": {
// ...
}
},
"storage/file": {
"description": "...",
"example": "...",
"methods": {
// ...
}
}
}
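A rough sketch of that "filter and combine" step, written in Python for illustration: walk a directory of per-page JSON files and key each one by its path relative to the root. The function name `combine_json_docs` is made up, and treating "index.json" as the page for its parent directory is an assumption based on the storage/index.json path linked earlier. The demo builds a throwaway directory so the snippet runs on its own.

```python
import json
import os
import tempfile


def combine_json_docs(root_dir):
    """Merge per-page JSON files into one dict keyed by router path."""
    combined = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.endswith(".json"):
                continue
            full_path = os.path.join(dirpath, filename)
            rel_path = os.path.relpath(full_path, root_dir)
            key = os.path.splitext(rel_path)[0].replace(os.sep, "/")
            # Assumption: "storage/index.json" documents "storage" itself.
            if key.endswith("/index"):
                key = key[:-len("/index")]
            with open(full_path) as handle:
                combined[key] = json.load(handle)
    return combined


# Throwaway demo layout: storage/index.json and storage/file.json.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "storage"))
for name, body in [("index", {"description": "Storage"}),
                   ("file", {"description": "File"})]:
    with open(os.path.join(demo_root, "storage", name + ".json"), "w") as out:
        json.dump(body, out)

combined = combine_json_docs(demo_root)
```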
I'm still not sure how our JS will recognize custom types when wanting to link a returns or argument to the full Class page. With node, we have avoided adding anything custom to our JSDoc markup (except for https://github.com/GoogleCloudPlatform/gcloud-node/pull/790). So we name our modules and refer to them by the JSDoc rules: "{module:storage/file}". Each of our languages will have their different standards here, so off hand, I'm not sure the best way to handle this.
We will all need to follow the routing as it exists now on gcloud-node:
It sounds to me like there might be a way to structure things so that we don't need to have a special path structure, and can all share one big JSON document. Not saying we must but... it's something I don't want to write off yet.
- /service -> Constructor and methods for base class (e.g. Storage)
- /service/sub-class -> Constructor and methods for (e.g. Storage/File)
- /service/sub-class?method=methodName -> Single method on sub-class (e.g. Storage/File#getFiles)
This will probably work for Ruby, but Python doesn't create a Storage class, it creates a storage.client.Client class. So for us the hierarchy is really more like:
- gcloud.storage (which lines up with the service)
- gcloud.storage.bucket.Bucket
We can definitely filter and combine our Dox JSON into one file of the essentials. And maybe key it by the same router hierarchy?
I'm a big fan of that. For us, we'll need everything each step of the way because gcloud.storage, gcloud.storage.bucket, and gcloud.storage.bucket.Bucket are all things that can hold descriptions, examples, and methods.
{
"gcloud": { /* Python module (ie, gcloud, gcloud.credentials, gcloud.storage, etc) */
"description": "...",
"type": "module",
"example": "...",
"methods": { /* Module-level methods (ie, gcloud.credentials.get_credentials)*/ }
},
"gcloud.storage": { /* Python module */
"description": "...",
"type": "module",
"example": "...",
"methods": {}
},
"gcloud.storage.bucket": { /* Python module */
"description": "...",
"type": "module",
"example": "...",
"methods": {}
},
"gcloud.storage.bucket.Bucket": { /* Python class */
"description": "...",
"type": "class",
"example": "...",
"methods": {}
},
"gcloud.storage.bucket.Bucket.__init__": { /* Python method */
"description": "...",
"type": "method",
"example": "...",
"methods": {}
}
}
Thoughts?
I should mention that the convention is to use no docstring on __init__ and to describe the constructor arguments / class attributes in the class docstring.
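As a sketch of how the Python side might produce entries like the ones above while respecting that convention, the snippet below uses the standard `inspect` module to flatten a class into dotted keys. The helper name `describe_class` is hypothetical, the `Bucket` class is a toy stand-in rather than the real gcloud class, and storing "methods" as a list of keys (instead of a nested object) is just one possible choice.

```python
import inspect


def describe_class(cls, prefix):
    """Flatten a class into {dotted_key: entry} (illustrative only).

    Underscore-prefixed members, including __init__, are skipped, so the
    constructor is documented through the class docstring.
    """
    class_key = prefix + "." + cls.__name__
    docs = {
        class_key: {
            "type": "class",
            "description": inspect.getdoc(cls) or "",
            "methods": [],
        }
    }
    for attr, member in sorted(vars(cls).items()):
        if attr.startswith("_") or not inspect.isfunction(member):
            continue
        method_key = class_key + "." + attr
        docs[class_key]["methods"].append(method_key)
        docs[method_key] = {
            "type": "method",
            "description": inspect.getdoc(member) or "",
        }
    return docs


class Bucket(object):
    """A toy bucket. Constructor arguments are described here."""

    def __init__(self, name):
        self.name = name

    def exists(self):
        """Check whether the bucket exists."""
        return True


docs = describe_class(Bucket, "gcloud.storage.bucket")
```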
Ah - fair, we can do that.
Dave told me we can match dynamic routes with Angular, so things are looking to be much easier.
To shift focus for a second, which one of these is our goal:
gcloud-cli
?), so that we can have: https://googlecloudplatform.github.io/gcloud/#/node (/python, /java, etc) Our individual gh-pages can just host our JSON files and index.html can redirect to the base site?
I've got the Node site building out a flat JSON file now, like:
{
"bigquery": [
{...},
{...},
{...}
],
"bigquery/dataset": [
{...},
{...},
{...}
],
...
}
Angular loads the file once, and doesn't care what the key looks like, if it's in the JSON, it'll load it.
So, "/docs/master/bigquery/dataset" loads "bigquery/dataset", "/docs/master/bigquery/dataset/abcdef/himom" would load "bigquery/dataset/abcdef/himom". There isn't anything special about the "/", it could be any series of characters, though the "/" does look best since it's a URL.
I special-cased the main API docs route "/docs/master" to look for a key named "gcloud" in the JSON doc.
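The lookup described here is simple enough to sketch outside Angular. The Python below is only an illustration of the rule (strip the route prefix, use the rest as the JSON key, fall back to "gcloud" for the bare route); the function names and the default prefix are assumptions, not the site's actual routing code.

```python
def route_to_key(route, prefix="/docs/master", root_key="gcloud"):
    """Map a docs route to the key used in the flat JSON file."""
    if not route.startswith(prefix):
        return None
    rest = route[len(prefix):]
    # Reject look-alikes such as "/docs/masterfoo".
    if rest and not rest.startswith("/"):
        return None
    key = rest.strip("/")
    # The bare "/docs/master" route is special-cased to the root key.
    return key or root_key


def load_page(docs, route):
    """Return the entry for a route, or None if the key is not in the JSON."""
    return docs.get(route_to_key(route))


docs = {"gcloud": ["overview"], "bigquery/dataset": ["dataset page"]}
dataset_page = load_page(docs, "/docs/master/bigquery/dataset")
landing_page = load_page(docs, "/docs/master")
```

Because the key is an opaque string, nothing here depends on "/" being the separator, which matches the point that any series of characters would work.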
I'm going to start transforming the JSON into something more like we've been talking about and will report back :+1:
Just bringing up some things to consider:
So here's an example of where we are with a potential common format.
{
"datastore.query.Query": {
"name": "Query",
"description": "<p>Build a Query object.</p><p><strong>Queries should be built with<br />{@linkcode module:datastore/dataset#createQuery} and run via<br />{@linkcode module:datastore/dataset#runQuery}.</strong></p>",
"line": 55,
"type": "class",
"methods": ["datastore.query.Query.autoPaginate", "datastore.query.Query.filter", "datastore.query.Query.hasAncestor", "datastore.query.Query.order", "datastore.query.Query.groupBy", "datastore.query.Query.select", "datastore.query.Query.start", "datastore.query.Query.end", "datastore.query.Query.limit", "datastore.query.Query.offset"],
"params": [{
"name": "namespace",
"types": ["string"],
"description": "<ul>\n<li>Namespace to query entities from.</li>\n</ul>\n",
"optional": true,
"nullable": false
}, {
"name": "kind",
"types": ["string"],
"description": "<ul>\n<li>Kind to query. </li>\n</ul>\n",
"optional": false,
"nullable": false
}],
"example": "var dataset = gcloud.datastore.dataset({\n projectId: 'grape-spaceship-123'\n});\n\n// If your dataset was scoped to a namespace at initialization, your query\n// will likewise be scoped to that namespace.\nvar query = dataset.createQuery('Lion');\n\n// However, you may override the namespace per query.\nvar query = dataset.createQuery('AnimalNamespace', 'Lion');\n\n// You may also remove the namespace altogether.\nvar query = dataset.createQuery(null, 'Lion');"
},
"datastore.query.Query.autoPaginate": {
"name": "autoPaginate",
"description": "",
"line": 80,
"type": "method",
"params": [],
"returns": ["module:datastore/query"]
},
"datastore.query.Query.filter": {
"name": "filter",
"description": "<p>Datastore allows querying on properties. Supported comparison operators<br />are <code>=</code>, <code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, and <code>&gt;=</code>. \"Not equal\" and <code>IN</code> operators are<br />currently not supported.</p><p><em>To filter by ancestors, see {@linkcode module:datastore/query#hasAncestor}.</em></p>",
"line": 108,
"type": "method",
"params": [{
"name": "filter",
"types": ["string"],
"description": "<ul>\n<li>Property + Operator (=, <, >, <=, >=).</li>\n</ul>\n",
"optional": false,
"nullable": false
}, {
"name": "value",
"types": [],
"description": "<ul>\n<li>Value to compare property to.</li>\n</ul>\n",
"optional": false,
"nullable": false
}],
"returns": ["module:datastore/query"],
"example": "// List all companies named Google that have less than 400 employees.\nvar companyQuery = query\n .filter('name =', 'Google')\n .filter('size <', 400);\n\n// To filter by key, use `__key__` for the property name. Filter on keys\n// stored as properties is not currently supported.\nvar keyQuery = query.filter('__key__ =', dataset.key(['Company', 'Google']));"
}
}
Any thoughts or criticisms?
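One property of this shape worth checking is that every key listed under a class's "methods" also exists as a top-level entry. A minimal Python sketch of that check follows; the helper name `split_method_keys` is made up, and the sample data is a trimmed version of the excerpt above, where most of the listed methods are not defined.

```python
def split_method_keys(docs, class_key):
    """Split a class's "methods" keys into (present, missing) lists."""
    present, missing = [], []
    for key in docs[class_key].get("methods", []):
        (present if key in docs else missing).append(key)
    return present, missing


# Trimmed from the example above: three methods listed, two defined.
docs = {
    "datastore.query.Query": {
        "name": "Query",
        "type": "class",
        "methods": [
            "datastore.query.Query.autoPaginate",
            "datastore.query.Query.filter",
            "datastore.query.Query.hasAncestor",
        ],
    },
    "datastore.query.Query.autoPaginate": {"name": "autoPaginate", "type": "method"},
    "datastore.query.Query.filter": {"name": "filter", "type": "method"},
}

present, missing = split_method_keys(docs, "datastore.query.Query")
```

A build step could fail on a non-empty `missing` list so the site never links to a method page that does not exist.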
What do you do for types that you don't own? We can link out to Credentials objects (owned by oauth2client) and even base types (e.g. int), though primitives are less important than complex types like Credentials or an Http object.
Good question! I'm pretty certain we only deal with either types we own or types native to node. I think we could just format the param/return (via markdown or html) to wrap the type name in a link?
{
"returns": ["<a href=\"http://path/to/external/docs\">Credentials</a>"]
}
Thoughts?
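If the linking happens at build time, it could look something like the Python sketch below: a lookup table of external types, with the type name wrapped in an anchor when a URL is known. The table, the function name `render_type`, and the URL (the placeholder from the snippet above) are all illustrative assumptions.

```python
from html import escape

# Hypothetical lookup table; the URL is the placeholder used in the thread.
EXTERNAL_TYPE_DOCS = {"Credentials": "http://path/to/external/docs"}


def render_type(type_name, external_docs=EXTERNAL_TYPE_DOCS):
    """Return the type name as HTML, linked when an external URL is known."""
    label = escape(type_name)
    url = external_docs.get(type_name)
    if url is None:
        return label
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), label)


linked = render_type("Credentials")
plain = render_type("int")
```

Keeping the raw type name in the JSON and linking at render time would be the alternative; it avoids baking HTML into the shared format.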
@callmehiphop Can you provide an example of "nested" parameters (options, callback properties, and so on) in the common format? For example, these options: https://github.com/GoogleCloudPlatform/gcloud-node/blob/master/lib/datastore/dataset.js#L76-L79
@quartzmo my pleasure!
{
"dns.zone.Zone.createChange": {
"name": "createChange",
"description": "<p>Create a change of resource record sets for the zone.</p>",
"line": 149,
"type": "method",
"params": [{
"name": "options",
"types": ["object"],
"description": "<ul>\n<li>The configuration object.</li>\n</ul>\n",
"optional": false,
"nullable": false
}, {
"name": "options.add",
"types": ["module:dns/record", "Array.<module:dns/record>"],
"description": "<ul>\n<li>Record objects to add to this zone.</li>\n</ul>\n",
"optional": false,
"nullable": false
}, {
"name": "options.delete",
"types": ["module:dns/record", "Array.<module:dns/record>"],
"description": "<ul>\n<li>Record objects to delete from this zone. Be aware that the resource records here<br /> must match exactly to be deleted.</li>\n</ul>\n",
"optional": false,
"nullable": false
}, {
"name": "callback",
"types": ["function"],
"description": "<ul>\n<li>The callback function.</li>\n</ul>\n",
"optional": false,
"nullable": false
}, {
"name": "callback.err",
"types": ["error"],
"description": "<ul>\n<li>An API error.</li>\n</ul>\n",
"optional": false,
"nullable": true
}, {
"name": "callback.change",
"types": ["module:dns/change"],
"description": "<ul>\n<li>A {module:dns/change} object.</li>\n</ul>\n",
"optional": false,
"nullable": true
}, {
"name": "callback.apiResponse",
"types": ["object"],
"description": "<ul>\n<li>Raw API response. </li>\n</ul>\n",
"optional": false,
"nullable": false
}],
"example": "var oldARecord = zone.record('a', {\n name: 'example.com.',\n data: '1.2.3.4',\n ttl: 86400\n});\n\nvar newARecord = zone.record('a', {\n name: 'example.com.',\n data: '5.6.7.8',\n ttl: 86400\n});\n\nzone.createChange({\n add: newARecord,\n delete: oldARecord\n}, function(err, change, apiResponse) {\n if (!err) {\n // The change was created successfully.\n }\n});"
}
}
It's probably worth noting that the syntax for the custom types isn't really set in stone, it's just how we had it previously.
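Since nesting is expressed only through dotted names ("options.add", "callback.err"), a consumer has to rebuild the tree itself. Here is a small Python sketch of one way to do that; the function name `nest_params` and the "children" field are assumptions, and it relies on parents being listed before their children, as they are in the example above.

```python
def nest_params(params):
    """Rebuild a parameter tree from dotted names like "options.add"."""
    roots = []
    by_name = {}
    for param in params:
        node = dict(param, children=[])
        by_name[param["name"]] = node
        # "options.add" -> parent "options"; "options" -> parent "".
        parent_name = param["name"].rpartition(".")[0]
        parent = by_name.get(parent_name)
        if parent is None:
            roots.append(node)
        else:
            parent["children"].append(node)
    return roots


# Names only, trimmed from the createChange example above.
params = [
    {"name": "options"},
    {"name": "options.add"},
    {"name": "options.delete"},
    {"name": "callback"},
    {"name": "callback.err"},
]

tree = nest_params(params)
top_level = [node["name"] for node in tree]
option_fields = [child["name"] for child in tree[0]["children"]]
```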
Anyone following along here, we've shifted the discussion over to gcloud-common: https://github.com/GoogleCloudPlatform/gcloud-common/issues/33 -- we're finalizing the schema, so please weigh in if you have any objections or just subscribe to stay updated.
Thanks for doing this and for remembering the people watching this thread.
Closing this since the discussion moved
From @dhermes in https://github.com/GoogleCloudPlatform/gcloud-python/pull/1092#issuecomment-135033490:
Sharing themes:
I think if we can find a way to generate a JSON output of our docs like the dox script does, we could use the exact same theme. I would really really like that.
/cc @stephenplusplus @callmehiphop to comment
Just using read-the-docs:
I totally get that this is "the Python way to host docs" -- I'm all for it in addition to the gcloud themed docs, the same way I'm all for Javadocs for gcloud-java and Godocs for gcloud-golang (and MSDN looking docs for gcloud-dotnet when we do it). The branding piece is important as well, and I'm fine with us investing the time to continue maintaining these.
Am I correct in the assumption that RTD / Javadocs / etc should take relatively small amounts of work going forward?