tqchen closed this issue 4 years ago
Let me share some thoughts on documentation suggestions, in the hope that my crude remarks may draw out better ideas from abler people.
What do you think are the important categories of docs?
- In particular, how can we better (re)organize our docs hierarchy?
I think Use TVM should come first, not TVM internal development; that is, teaching users how to integrate TVM into their projects smoothly and quickly should take first place. In this part, we should write the tutorial in great detail, end to end, from Installation to Deployment. Where there is code, we should explain it line by line patiently and list the output (the current execution status). So our Installation, Compile Deep Learning Models, Auto Tuning, and Deployment parts should be integrated into Use TVM to make one good example.
Moreover, we should supply an FAQ / notes on common error messages, for example ValueError: Direct host side access to device memory is detected... or cannot open shared object..., and tell users the possible reasons and solutions.
Would it be helpful to introduce a global architecture diagram? What should that look like?
Yes, it would be very helpful. I think this diagram is good enough.
Propose titles of docs that you would like to see, and a small abstract
- This gives us a sense of which docs specifically we need
- TVM Design and Overview, for users who want to understand the TVM architecture.
- Use TVM, for users who want to integrate TVM into their projects.
- Develop TVM, for developers who want to hack on TVM.
- API, for developers who need to look up the TVM API.
- Resources: TVM educational courses and discussion with other TVM developers.
How do they fit into the docs hierarchy?
In TVM Design and Overview, we could have the architecture diagram mentioned before, the Relay / NNVM IR reference, and so on.
In Use TVM, we could integrate our Installation, Compile Deep Learning Models, Auto Tuning, Deployment, and the like. More importantly, we should construct one good end-to-end example to show.
In Develop TVM, we could integrate our Optimize Tensor Operators, Tensor Expression and Schedules, and the like.
In API, we could integrate our Python API, NNVM Core Tensor Operators, and the like.
In Resources, we could list useful tutorials (including external ones), courses, and our discussion forums.
How can we cross-reference docs? Should there be a global index into the docs?
The current index is good enough.
When we are doing a particular doc, how can we give the reader a big picture on which part of the system we are talking about?
A good title should help. Beyond that, we should list key information in the header of each doc, including the technical area (frontend, TOPI, and so on) and the hardware (ARM CPU, GPU, FPGA, and so on).
Having just started to look into using TVM, I can only agree with the first part of what @FrozenGene is saying. Please provide a hands-on tutorial, and please do not focus it only on Python. I think there are plenty of people who do not want to integrate this as yet another Python framework; there are so many Python-focused frameworks out there already.
I get it, Python is great for experimenting and doing network design and training, but in my eyes it is not a suitable platform for serious end applications. And the C++ deployment documentation so far is thin (and that's the best I can say).
@FrozenGene In the Use TVM section, which might be better titled Getting Started with TVM, I think you will need to create a progression of examples, progressively more complex, and organized by DL discipline (image, sound, text). Mimicking the organization of the TensorFlow tutorials would IMHO be good: many folks are familiar with them, they have been battle-tested on tens of thousands of people, and it would create a direct connection/familiarity with the TF crowd.
@Ravenwater Yes. I proposed Use TVM above just to emphasize that we should care more about usage, not development. I agree Getting Started with TVM is a better name. :-)
Propose titles of docs that you would like to see, and a small abstract. This gives us a sense of which docs specifically we need.
Writing a new construct in Relay has become incredibly tedious. When I add a reference I have to:
0. update expr
1. update expr_functor, expr_visitor, expr_mutator
2. update structural_hash, ir_printer, alpha_equal
3. update interpreter, fuse_ops, memory_planner, to_anf, free_vars
It is a very long list. Also, what if I miss something? (In fact I missed __init__.py; did you catch it?)
The same applies to adding another function: pass.py, test_xxx.py, pass.h, xxx.cc, etc. Even if only for the sake of dogfooding by the Relay team, it would be good to have an internal development guide; for now it can be a checklist of things to do (including updating the checklist) when adding a new construct/operation.
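The pain described above is inherent to the expression-visitor pattern that the Relay codebase uses: every traversal (printer, hasher, interpreter, ...) dispatches on the node type, so each new node must be handled in each traversal class. A minimal stdlib-Python sketch of the failure mode (class and method names here are illustrative, not Relay's actual API):

```python
class Expr:
    """Base class for all IR nodes in this toy example."""
    pass

class Const(Expr):
    def __init__(self, value):
        self.value = value

class Add(Expr):
    def __init__(self, lhs, rhs):
        self.lhs = lhs
        self.rhs = rhs

class ExprVisitor:
    """Every traversal subclasses this. Adding a new Expr subclass means
    adding a visit_* method here and in every subclass that needs special
    handling -- which is exactly the tedium described above."""
    def visit(self, expr):
        method = getattr(self, "visit_" + type(expr).__name__.lower(), None)
        if method is None:
            raise NotImplementedError(
                "no visitor for %s -- did a pass forget to handle it?"
                % type(expr).__name__)
        return method(expr)

class Interpreter(ExprVisitor):
    def visit_const(self, e):
        return e.value
    def visit_add(self, e):
        return self.visit(e.lhs) + self.visit(e.rhs)
```

With this layout, `Interpreter().visit(Add(Const(1), Const(2)))` evaluates to 3, but introducing a new node such as a `Ref` without updating `Interpreter` (and every sibling pass) raises `NotImplementedError` only at visit time, which is one reason a written checklist, or an exhaustive default in the base class, helps.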
What do you think are the important categories of docs?
I second @FrozenGene's comment. Right now, there is no clear hierarchy in the tutorials. We need to consider our users and their levels of familiarity: what we would like to give them first, and what more to offer those with deeper interest. Writing a great tutorial is an art, and it needs care, which we should strive for as we head toward the v1.0 release.
What we have is good but not good enough. We need a What is TVM? page: think about how you'd explain it to your parents, not how you'd write a paper! If getting more contributions to the docs is also part of the goal, then it's worth thinking about how to shorten the feedback cycle. Currently in TVM, if people find an issue in the docs, they would need to do one of the following:
We tried these things in other projects:
IMHO, I need two things to get started quickly.
How to use? An example that uses roughly 80% of the features of the platform (maybe we need one for each backend, such as CPU, OpenCL, CUDA, Metal). Currently we actually have almost all of them, but it takes a while to find a simple example for Metal or FPGA. At a minimum, we need a CPU-only example. Many employees of AI ASIC accelerator companies do not have a GPU or the other platforms on their machines; if they want to try TVM without a GPU, they have to strip out CUDA and so on to get an example to work.
How to hack? LLVM gives users the ability to generate IR from Clang and also provides ways to output the IR after each pass. Currently I still do not know how many different IRs are generated on the way from a TVM model to CUDA (computation graph, Halide-like for loops, CUDA). Can we generate a textual representation of the computation graph after graph optimization in NNVM or Relay? I think it would be better to have an example showing how to inspect these by adding options to a command (or adding a print call in Python). We have diagrams showing the structure of TVM, but it would be better if there were an example to go with each diagram.
@szha "edit button" will be very helpful.
I agree with @szha's points.
+1 for edit button +1 for Disqus panel
BTW, I feel like gluon-nlp's CLI demo is super cool stuff that is nice to have.
I want to clarify that people can directly send a pull request to change the docs, without having to file an RFC issue. An RFC is only necessary for new features.
@tqchen I didn't see the RFC process in the contribute guide. Maybe the RFC process (and that doc change doesn't require one) needs to be documented here: https://docs.tvm.ai/contribute/
Agreed, that is something that would need to be properly documented; so far the decision of whether an RFC is necessary is made by lazy consensus among committers.
Maybe start by improving the inline comment style documents? Many of them are only sentences of around five words, like the example below. When I use these APIs, I really need to read the implementation details to know what will happen :(
def func_example(a):
    """This is Func example.

    Parameters
    ----------
    a : str
        arg a
    """
@jackwish it would be great if you could raise questions about the functions that you would like to see improved; contributions are more than welcome
@jackwish Agree. Maybe adding usage cases as examples in the comments would be helpful. When I start with something that I don't really know, I usually look into the tests folder and locate the code segments that call the functions/classes. If we can put some brief code samples as Examples along with Parameters, it would be more helpful for understanding what will happen. A typical example would look like the following:
def func_example(a):
    """This is Func example.

    Parameters
    ----------
    a : A
        arg a

    Examples
    --------
    >>> a = A(p0=1, p1=2)
    >>> b = func_example(a)  # b = 1 + 2
    """
On the other hand, I found there are unittests and integration tests available in Jenkins. However, the code in apps and tutorials is mostly not tested by Jenkins. To improve the quality of the code base under continuous development and integration, are we considering adding apps and tutorials to the Jenkins tests?
Tutorials and some of the apps (extension) are tested by Jenkins.
@tqchen I will look into things and find where I may contribute. @liangfu I think a code example is a good idea to help people learn how to use an API, and TVM has many tutorials which the inline documentation can point to. I intended that example to illustrate two things: some APIs don't need even the one-sentence style of documentation because the API itself is clear enough, while other APIs actually need a paragraph (or several) to describe their functionality, restrictions, and so on.
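On keeping such Examples sections honest under Jenkins: Python's stdlib doctest module can execute the `>>>` snippets in a docstring and compare their output, so documented examples cannot silently rot. A sketch using a hypothetical function (not a real TVM API):

```python
import doctest

def func_example(a, b):
    """Add two numbers.

    Examples
    --------
    >>> func_example(1, 2)
    3
    """
    return a + b

# Collect the ``>>>`` examples from the docstring and run them; a CI job
# can then fail the build when any example's output does not match.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner()
for test in finder.find(func_example, "func_example",
                        globs={"func_example": func_example}):
    runner.run(test)
```

After the loop, `runner.failures` is 0 when every example behaved as documented; a test harness can assert on it (or simply call `doctest.testmod()` for a whole module).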
Like many people have pointed out, maybe we can categorize the materials for users and developers, as users are likely to import/tune/deploy models rather than dive into IR internals. Managing decent documentation requires extra effort from contributors; balancing that tradeoff is an art :)
There are some good points mentioned above: having a set of good getting-started guides based on different use cases would certainly be a good starting point.
One of the things that I found hard when getting started with TVM as a user was the absence of canned frontends: at the end of the build process one just gets a set of libraries. It is then that one goes and looks at the tutorials to check that things work. Further, figuring out the exact version of TensorFlow that works with the current trunk of TVM took a bit of time, as one cannot expect tensorflow==2.0.0-alpha to work out of the box. This raises the question of whether we should also document the versions of the various frameworks that work with a given version of TVM.
I also think it is worth creating an issues page which details what is expected in a bug report: how a user should report a bug, and the minimal set of things we require in a report so that someone else can reproduce it, in order to drive a higher quality of bug reports over time in the community.
A link to the CI system from the main website seems overdue. I stumbled across ci.tvm.ai by accident, but it would be good to have some of this handled from the top level.
And thanks for reading till here.
Regards, Ramana
Dear all, @tqchen: do we have good documentation for TVM? I am interested in learning more about TVM's compiler optimization strategies, especially for scientific data (HEP, NP) and DL models for these applications. Any feedback or hints on where to start in the source code?
I have been working with PyTorch a bit lately. One thing that is really great about it is that PyTorch has an architecture/code-structure document. It is a detailed description of the core system that would allow a good outside coder to build a "mini-pytorch" just by reading it. Should such a thing exist for TVM?
@MarisaKirisame #6097 aims to achieve part of the goal
Closing for now; the new design and architecture guide has landed in the official docs at https://tvm.apache.org/docs/dev/index.html. Let us open new issues for further improvements.
There has been a growing demand for better documentation, which many of us totally agree with. Good developer documentation makes it easy for others to quickly jump in and use TVM, and helps us grow the community.
We need your help in giving thoughts on how we can do better, in particular:
Please share your thoughts. We will keep this RFC open for a while. The committers and PMCs will then summarize the discussion into a documentation roadmap and we will make that our top priority in the 0.6 release cycle.
cc @dmlc/tvm-team
Please Comment on Global Docs Organization
This is something that we would especially love to pick everyone's brain on. It is very easy to accumulate a bunch of documents, but if we don't organize them clearly and provide pointers, they will not be as helpful.