Extensible rules for Static Analyzer

Quuxplusone commented 15 years ago


Bugzilla Link	PR3630
Status	NEW
Importance	P enhancement
Reported by	Ian Symondson (ian.symondson@l-3com.com)
Reported on	2009-02-20 06:55:39 -0800
Last modified on	2010-02-22 12:47:28 -0800
Version	unspecified
Hardware	PC Windows XP
CC	ian.symondson@l-3com.com, kremenek@apple.com, llvm-bugs@lists.llvm.org, nikita@zhuk.fi, xu_zhong_xing@163.com

We are developing in Objective-C and are hoping to use clang static analysis on
our software. One of our additional desires is to be able to check code against
our coding standards. Does something exist which would enable us to add some of
our own rules for parsing, or are there any intentions to provide such a
feature?
Thanks in advance for your help.

Quuxplusone commented 15 years ago

Hi Ian,

Your question is fairly open-ended, so it isn't clear to me what form of extensibility you are looking for right now. I'm going to give a broad, short answer here, and then follow up with more detail once I get a better idea of what you are looking for.

The analyzer itself is written in a highly modular fashion, and new checks are implemented incrementally. If you wanted to get your hands dirty in the C++ guts of the analyzer, there are several very natural entry points for you to attach your own custom checks. While the analyzer itself doesn't load custom checks via shared libraries, the necessary infrastructure in LLVM is already there to support plugins. Such plugins would interface with Clang and the analyzer via the C++ APIs, giving them direct access to the ASTs and analyzer data structures. Many checks are actually very easy to write, as the analyzer core does most of the heavy lifting when reasoning about the flow of values, aliasing, etc.

As I am sure you are aware, the analyzer itself is a very long-term project. I would like to support both a plugin model for the "low-level" C++ APIs and (eventually) higher-level interfaces, e.g. scripting, for writing custom checks. The idea would be to provide an interface to the analyzer that doesn't require expertise in the nitty-gritty details of C, Clang, or the analyzer, yet allows checker writers to encode the often well-structured rules that appear in their APIs as custom checks. I don't have an ETA for this; it is a project unto itself, and I expect it to happen gradually as community interest in the analyzer grows.
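
To make the "well-structured rules" idea a little more concrete, here is a minimal sketch of what a data-driven rule could look like. Everything in it is hypothetical: `PairedCallRule`, `checkCalls`, and `demoViolations` are illustration-only names, not part of Clang or the analyzer, and the "program" is just a flat list of called function names rather than anything the analyzer actually reasons about.

```cpp
#include <string>
#include <vector>

// Hypothetical rule format (not an analyzer API): every call to
// `acquire` must later be balanced by a call to `release`.
struct PairedCallRule {
  std::string acquire;
  std::string release;
};

// Interprets each rule over a flat sequence of called function names and
// returns one message per violation. A real check would have to reason
// about paths through the code; this sketch ignores control flow entirely.
std::vector<std::string> checkCalls(const std::vector<PairedCallRule> &rules,
                                    const std::vector<std::string> &calls) {
  std::vector<std::string> problems;
  for (const PairedCallRule &rule : rules) {
    int outstanding = 0;  // acquires not yet balanced by a release
    for (const std::string &call : calls) {
      if (call == rule.acquire) {
        ++outstanding;
      } else if (call == rule.release) {
        if (outstanding == 0)
          problems.push_back(rule.release + " called without matching " +
                             rule.acquire);
        else
          --outstanding;
      }
    }
    if (outstanding > 0)
      problems.push_back(std::to_string(outstanding) + " call(s) to " +
                         rule.acquire + " never balanced by " + rule.release);
  }
  return problems;
}

// Small usage example with two made-up rules and a made-up call sequence.
std::vector<std::string> demoViolations() {
  std::vector<PairedCallRule> rules = {{"lock", "unlock"}, {"open", "close"}};
  std::vector<std::string> calls = {"lock",  "open",  "unlock",
                                    "close", "close", "lock"};
  return checkCalls(rules, calls);
}
```

The point of the sketch is only that the rule itself is plain data, so whoever writes it needs to know the API being checked, not the internals of the tool that interprets it.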

I would be more than happy to talk about both directions. For this discussion it would help to have a better idea of what kinds of checks you are interested in writing. I could then elaborate on what can already be implemented in the analyzer at this point (and how), as well as which checks would require various pieces of the analyzer to be developed further before they could be implemented.

Quuxplusone commented 15 years ago

I've also CC'ed Nikita Zhuk and Zhongxing Xu on this Bugzilla report. Nikita has been working on his own fork of Clang and the analyzer to implement his own custom Objective-C checks that enforce various coding standards at his software shop:

http://www.karppinen.fi/analysistool/

Zhongxing is the biggest contributor to the analyzer core other than myself, and is helping drive key aspects of its internal algorithms, infrastructure, and overall design.

Quuxplusone commented 15 years ago

(In reply to comment #1)
> Hi Ian,
>
> Your question is fairly open-ended, so it isn't clear to me what form of
> extensibility you are looking for right now.  I'm going to give a broad, short
> answer here, and then follow up with more detail once I get a better idea of
> what you are looking for.
>
> The analyzer itself is written in a highly modular fashion, and new checks are
> implemented incrementally.  If you wanted to get your hands dirty in the C++
> guts of the analyzer, there are several very natural entry points for you to
> attach your own custom checks.  While the analyzer itself doesn't load custom
> checks via shared libraries, the necessary infrastructure in LLVM is already
> there to support plugins.  Such plugins would interface with Clang and the
> analyzer via the C++ APIs, giving them direct access to the ASTs and analyzer
> data structures.  Many checks are actually very easy to write, as the analyzer
> core does most of the heavy lifting when reasoning about the flow of values,
> aliasing, etc.
>

Currently, various basic checkers are hard-coded directly in GRExprEngine. As a
starting point for a pluggable checker infrastructure, how about moving them
into separate pluggable transfer functions? To be more specific, GRExprEngine
would do the basic semantic simulation, and GRTransferFuncs subclasses would do
the various checks plus whatever semantic simulation those checks need. This is
only an imagined structure; I'm not sure of its feasibility.
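
A minimal, self-contained sketch of the split described above, under loud assumptions: `Engine` loosely plays the role of GRExprEngine and the `Checker` subclasses loosely play the role of GRTransferFuncs subclasses, but none of these types or signatures come from Clang. The "program" is a toy list of operations on integer-valued variables, and, as a simplification, the checkers here only inspect state and do no simulation of their own.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; none of these are Clang types.
enum class OpKind { Assign, Deref, Divide };

struct Operation {
  OpKind kind;
  std::string var;  // variable the operation acts on
  int value;        // value assigned (only meaningful for Assign)
};

struct Report {
  std::string checker;
  std::string message;
};

using State = std::map<std::string, int>;  // variable name -> known value

// Interface every pluggable check implements.
class Checker {
public:
  virtual ~Checker() = default;
  virtual std::string name() const = 0;
  virtual void check(const Operation &op, const State &state,
                     std::vector<Report> &out) const = 0;
};

class NullDerefChecker : public Checker {
public:
  std::string name() const override { return "NullDeref"; }
  void check(const Operation &op, const State &state,
             std::vector<Report> &out) const override {
    if (op.kind != OpKind::Deref) return;
    auto it = state.find(op.var);
    if (it != state.end() && it->second == 0)
      out.push_back({name(), "dereference of null pointer '" + op.var + "'"});
  }
};

class DivideByZeroChecker : public Checker {
public:
  std::string name() const override { return "DivideByZero"; }
  void check(const Operation &op, const State &state,
             std::vector<Report> &out) const override {
    if (op.kind != OpKind::Divide) return;
    auto it = state.find(op.var);
    if (it != state.end() && it->second == 0)
      out.push_back({name(), "division by zero via '" + op.var + "'"});
  }
};

// The engine owns the simulation and knows nothing about specific checks.
class Engine {
public:
  void addChecker(std::unique_ptr<Checker> c) {
    checkers_.push_back(std::move(c));
  }
  std::vector<Report> run(const std::vector<Operation> &program) const {
    State state;
    std::vector<Report> reports;
    for (const Operation &op : program) {
      // Let every registered checker look at the operation first...
      for (const auto &c : checkers_) c->check(op, state, reports);
      // ...then apply the operation's effect on the simulated state.
      if (op.kind == OpKind::Assign) state[op.var] = op.value;
    }
    return reports;
  }

private:
  std::vector<std::unique_ptr<Checker>> checkers_;
};

// Usage example: registers two checkers and runs a six-step toy program.
std::vector<Report> runDemo() {
  Engine engine;
  engine.addChecker(std::make_unique<NullDerefChecker>());
  engine.addChecker(std::make_unique<DivideByZeroChecker>());
  std::vector<Operation> program = {
      {OpKind::Assign, "p", 0}, {OpKind::Deref, "p", 0},
      {OpKind::Assign, "d", 0}, {OpKind::Divide, "d", 0},
      {OpKind::Assign, "d", 2}, {OpKind::Divide, "d", 0},
  };
  return engine.run(program);
}
```

With this shape, adding a check means adding one more `Checker` subclass and registering it; the engine loop does not change.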

Quuxplusone commented 15 years ago

Hi Ian,

I've been very excited about Clang in particular because it allows me to write custom checks for our coding standards and to automate the use of knowledge about coding practices and patterns gained from developers' experience and manual code reviews. As Ted already mentioned, I've been writing my own checks for Obj-C code at our company, and we have released the app for others to use as well (the app is still at a very early stage and I have a long TODO list for it). I can confirm that writing basic (but still useful) checks has been fairly easy for a non-compiler guy like me.

Quuxplusone commented 15 years ago

(In reply to comment #2)
> I've also CC'ed Nikita Zhuk and Zhongxing Xu on this Bugzilla report.  Nikita
> has been working on his own fork of Clang and the analyzer to implement his own
> custom Objective-C checks that enforce various coding standards at his software
> shop:
>
> http://www.karppinen.fi/analysistool/
>
> Zhongxing is the biggest contributor to the analyzer core other than myself,
> and is helping drive key aspects of its internal algorithms, infrastructure,
> and overall design.
>

Apologies for not replying sooner to your question. I have been trying to find
out specifics of what we want to be able to do and what is already available in
clang static analysis. Is there a list of exactly what things are checked by
the static analyzer? This would help a lot.
Thanks,
Ian

Quuxplusone commented 15 years ago

(In reply to comment #5)
> (In reply to comment #2)
> > I've also CC'ed Nikita Zhuk and Zhongxing Xu on this Bugzilla report.  Nikita
> > has been working on his own fork of Clang and the analyzer to implement his own
> > custom Objective-C checks that enforce various coding standards at his software
> > shop:
> >
> > http://www.karppinen.fi/analysistool/
> >
> > Zhongxing is the biggest contributor to the analyzer core other than myself,
> > and is helping drive key aspects of its internal algorithms, infrastructure,
> > and overall design.
> >
>
> Apologies for not replying sooner to your question. I have been trying to find
> out specifics of what we want to be able to do and what is already available in
> clang static analysis. Is there a list of exactly what things are checked by
> the static analyzer? This would help a lot.
> Thanks,
> Ian
>

Hi Ian,

Recently I have been writing up a list of the currently implemented checks for
the website.  I hope to have that up soon.

Aside from the retain/release checking, most of the checks have to do with
language semantics, e.g. null dereferences, uses of uninitialized variables,
zero-size VLAs, etc.  There are a few API-specific checks, e.g., correct uses of
CFNumberCreate, but there are only a handful of those right now.  API-specific
checks aren't hard to implement; I've simply been prioritizing other things
recently (e.g., enhancing the core analysis engine).  I also tend not to add a
check unless I feel I have time to do a good job on it.  Given the choice
between having many checks in the analyzer and having the analyzer do a good
job on a few checks, I would choose the latter.
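
For readers unfamiliar with the language-semantics category, here is a small illustration of the sort of code such checks are aimed at. The function names are made up for this example, and it does not claim to show the analyzer's exact diagnostics. Each function is well-defined on some inputs and broken on others, which is why this kind of defect is easy to miss in review.

```cpp
// Null dereference: the check on `s` shows it may be null, but the
// function falls through and dereferences it anyway on that path.
char firstChar(const char *s, int *errors) {
  if (s == nullptr)
    ++*errors;   // notes the problem but does not return
  return s[0];   // undefined behavior when s == nullptr
}

// Use of an uninitialized variable: `factor` is only assigned on one
// branch, yet it is read unconditionally.
int scale(int v, bool haveFactor) {
  int factor;          // deliberately left uninitialized
  if (haveFactor)
    factor = 2;
  return v * factor;   // undefined behavior when haveFactor is false
}
```

Calling `firstChar("abc", &errors)` or `scale(21, true)` is fine; the defects only show up with a null `s` or with `haveFactor == false`.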

That said, I obviously want the number of checks in the analyzer to increase.
My belief is that over time additional high-quality checks will become
increasingly available to users, and this trend will be accelerated as
contributors with specialized knowledge of an API or domain write checks that
would be useful to others.
