sourcegraph / srclib

srclib is a polyglot code analysis library, built for hackability. It consists of language analysis toolchains (currently for Go and Java, with Python, JavaScript, and Ruby in beta) with a common output format, and a CLI tool for running the analysis.
https://srclib.org
Other
942 stars 62 forks source link

src lint: warn if offsets (in refs, defs, docs, etc.) are char offsets, not byte offsets #110

Open sqs opened 9 years ago

bsummer4 commented 9 years ago

Currently, src produces incorrect results for srclib-haskell if it outputs byte offsets.

sqs commented 9 years ago

Yes, that's because it has that hack that converts all non-whitelisted source unit offsets from characters to bytes (which breaks if they're already bytes). Feel free to commit a change to that if statement.

bsummer4 commented 9 years ago

The current behavior is actually much more convenient for srclib-haskell. I was hoping to delete the code that handles the conversion from char-offsets to bytes.

Could we make this opt-in, instead of having hard-coded special cases?

sqs commented 9 years ago

Yeah, making it opt-in seems good.

bsummer4 commented 9 years ago

Do you think that setting {Srclibtoolchain.graph.OffsetType = "bytes" | "characters"} would be an appropriate way to opt-in?

On Thu, Jan 22, 2015 at 12:34 PM, Quinn Slack notifications@github.com wrote:

Yeah, making it opt-in seems good.

— Reply to this email directly or view it on GitHub https://github.com/sourcegraph/srclib/issues/110#issuecomment-71093507.

sqs commented 9 years ago

Yeah, that sounds good to me. Then I think we'll have to construct the Makefile recipes (that src make builds/uses) differently depending on the value of OffsetType, since right now all of the graphers pipe to | src internal normalize-graph-data to do the conversion. That normalize-graph-data subcommand has no easy way of knowing the source toolchain's config, but we do know that at Makefile creation time.

bsummer4 commented 9 years ago

Ok, so: src reads the Srclibtoolchain file, creates a makefile, that makefile currently pipes the graph output to "normalize-graph-data", which currently looks at the unitType to see if it needs to convert the offsets. Is this correct?

How about we change how the makefile is generated, so that normalize-graph-data is passed a flag?

This approach would avoid breaking things during transition. Pass a flag only if the field is set in the Srclibtoolchain, If no flag is passed, use the current hard-coded rules.

sqs commented 9 years ago

Sounds good!