getify / concrete-syntax-tree

Defining a standard JavaScript CST (concrete syntax tree) to complement ASTs

Register interest #1

Closed getify closed 9 years ago

getify commented 10 years ago

Please leave a comment here with your interest in participating. List whatever stake you may have (such as "I work on XYZ tool that needs this..."), and any relevant contact info.

Also, please list what timezone you are primarily based in, for meeting planning purposes.

List any relevant contact info (twitter, email, etc) as you see fit.

getify commented 10 years ago

I'm Kyle Simpson (aka "getify"). My interest is as an initial organizer of this effort, as I am writing various tools (code transpilers, code formatters) that need a CST format. In particular, I need a code parser (like Esprima, Acorn, etc) that can give me a CST without losing information while processing a file. I'd also like to be able to produce a CST (perhaps modifying an existing one) and have a code generator (like escodegen) that produces the appropriate code.

I'm generally in CST timezone (currently GMT-5). I can be reached at twitter @getify or via gmail (same name).

getify commented 10 years ago

Hopefully @constellation @dherman @davidbruant @michaelficarra and others will come register here. :)

michaelficarra commented 10 years ago

I highly recommend anyone posting here read https://github.com/Constellation/escodegen/pull/133 which culminated in two proposals detailed here: https://github.com/Constellation/escodegen/pull/133#issuecomment-25893771

edit: I just saw that this information is also included in this project's README, which I have not yet read.

I work on a bunch of JS tooling including escodegen, esmangle, esprima, esquery, and eslint among others. I also work on CoffeeScript compilers and participate in other compile-to-JS language communities. It should be obvious why all of these tools could benefit from a CST standard. As of April, I will be spending most of my time in the US Pacific timezone (US Central until then). Twitter: @jspedant. Email can be found on my github profile.

Constellation commented 10 years ago

I'm Yusuke Suzuki (a.k.a. Constellation). I have been working on JavaScript AST tools (esprima, escodegen, and so on). My primary interest is building composable infrastructure for JS tools. At a highly abstract level, this is done with the JS AST, but we also need a way to control the lower level, closer to the source code. I need a CST to gain more fine-grained control over JS code generation.

I'm primarily based in the JST (GMT+9) timezone. Feel free to reach me on Twitter (@Constellation) or Gmail (utatane.tea@gmail.com).

edit: opened an issue to generate CST from escodegen. Constellation/escodegen#177

dangoor commented 10 years ago

Hi @getify et al. I'm interested in this topic as well, as it relates to my work on Brackets.

DavidBruant commented 10 years ago

I'm interested, mostly out of being curious. No particular application in mind.

ljharb commented 10 years ago

My interest is that a CST would allow for non-Pratt-parser-based linters (i.e., not jslint/jshint) to be built that can enforce style, including whitespace and inside comments.

michaelficarra commented 10 years ago

@ljharb: Check out eslint.

kriskowal commented 10 years ago

My interest is that this has the potential to simplify a JSDoc-like parser, in that it would not be necessary to correlate comment positions with declaration positions. However, it would also be useful to be able to cross-reference the concrete and abstract syntaxes, since the abstract is more useful for type inference and scope analysis.
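For context, the position correlation mentioned above can be sketched roughly as follows. This is only an illustration under assumed data shapes: the `text`, `name`, and `range` fields and the `attachDocComments` helper are hypothetical, not the output format of JSDoc or of any particular parser.

```javascript
// Hypothetical shapes: comments are { text, range: [start, end] } and
// declarations are { name, range: [start, end] }, with character offsets.
function attachDocComments(comments, declarations) {
  // Sort declarations by start offset so "first one after the comment" is well defined.
  const decls = [...declarations].sort((a, b) => a.range[0] - b.range[0]);
  const docs = new Map();
  for (const comment of comments) {
    // Only /** ... */ style comments count as documentation in this sketch.
    if (!comment.text.startsWith("/**")) continue;
    // Attach the comment to the first declaration that starts after it ends.
    const target = decls.find((d) => d.range[0] >= comment.range[1]);
    // A later (closer) doc comment overwrites an earlier one for the same declaration.
    if (target) docs.set(target.name, comment.text);
  }
  return docs;
}

const docs = attachDocComments(
  [
    { text: "/** Adds two numbers. */", range: [0, 24] },
    { text: "// plain note", range: [60, 73] },
    { text: "/** Subtracts. */", range: [75, 92] },
  ],
  [
    { name: "add", range: [25, 58] },
    { name: "sub", range: [93, 130] },
  ]
);
```

With a CST that keeps comments in the tree next to the nodes they precede, this lookup by offsets would presumably not be needed.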

jednano commented 10 years ago

I work on CodePainter, which needs access to CST whitespace information to check, fix and infer JavaScript.

Most important is the following information:

CST would allow me to incredibly simplify the code base. I would be interested in contributing to formatters from CST back to source code (e.g., spaces after control statements).

Funny – my time zone is also CST.

You can contact me via email, Google Plus or Google chat using jed dot hunsaker at gmail.

nzakas commented 10 years ago

I'm working on ESLint, and we've been struggling to deal with these issues, as well. Consider this me registering my interest.

In the PST.

fpirsch commented 10 years ago

Interested too.

joeedh commented 10 years ago

Interested. Would there be support for ES6 syntax, too?

jsoverson commented 10 years ago

I'm interested. I work on plato, which acts mostly as a consumer of other libraries in this thread, but this would prove useful for analysis, finding patterns, and additional presentation of code-based metrics.

getify commented 10 years ago

Thanks for all the great interest so far!

Reminder: we need to coordinate an initial online meeting, so it would be very helpful if you'd edit your response in this thread and include what timezone you are normally in. We want to find something that the most people can attend easily. :)

brandonpayton commented 10 years ago

I am interested, initially due to the challenges facing eslint. I am in the PST (UTC-08) timezone.

jsoverson commented 10 years ago

Pacific time, in San Diego.

jednano commented 10 years ago

I would add that a Google Hangout seems rather appropriate for such a meeting.

gkz commented 10 years ago

I am interested. I work on LiveScript, a compile to JS language, but I am mostly interested due to my work on Grasp, JavaScript structural search and replace (ie. based on the AST). While searching with Grasp doesn't require a CST (that's the point), doing replacement could benefit from this proposal. When doing replacement, the original formatting must not be lost, so the current system uses location data from the nodes to modify portions of the source text, but working on a CST then doing codegen would be nicer.

Timezone: PST
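The location-based replacement described above can be sketched like this. It is a minimal illustration, not Grasp's actual implementation: `applyReplacements` is a hypothetical helper, and the ranges are computed with `indexOf` here as a stand-in for the ranges a parser would report on AST nodes.

```javascript
// Hypothetical helper: splice new text into the original source at node ranges,
// leaving every character outside those ranges (whitespace, comments) untouched.
function applyReplacements(source, replacements) {
  // Apply from the end of the file backwards so earlier offsets stay valid.
  const sorted = [...replacements].sort((a, b) => b.range[0] - a.range[0]);
  let result = source;
  for (const { range: [start, end], text } of sorted) {
    result = result.slice(0, start) + text + result.slice(end);
  }
  return result;
}

const source = "var  x = 1 ; // keep me\nfoo( x );";

// Stand-ins for the ranges of the two `x` identifier nodes.
const first = source.indexOf("x");
const second = source.lastIndexOf("x");

const renamed = applyReplacements(source, [
  { range: [first, first + 1], text: "count" },
  { range: [second, second + 1], text: "count" },
]);
// renamed === "var  count = 1 ; // keep me\nfoo( count );"
```

The odd spacing and the comment survive because only the replaced ranges are touched. The cost is that every edit has to be expressed as offsets into the original text, which is the part a CST plus codegen would make nicer.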

dxnn commented 10 years ago

I'm interested. In EST, but on PST schedule.

wesleyhales commented 10 years ago

Interested, PST

zaach commented 10 years ago

I'm interested. I wrote a JavaScript parser (reflect.js) that I'd like to eventually support CSTs. PST.

Elergy commented 10 years ago

I'm interested too. I'm in UTC+04 timezone, but this is no problem for me.

jeffmo commented 10 years ago

Very much interested here. I work at Facebook on our JS parsing and static code analysis tools, and we use esprima for a lot of things right now. We've had to do some relatively hacky workarounds to estimate or best-guess concrete syntactic elements in many of our tools and libs -- and a formal lossless CST would do wonders for us!

If we can gain some momentum here, I would also be willing to help encourage adoption by converting some of our open source tools to use any such standard we may come up with here.

I am on PST in the Bay Area. Twitter: twitter.com/lbljeffmo

Munter commented 10 years ago

Interested, though probably not qualified to contribute a lot. I'm involved in assetgraph, a static code analysis tool that uses ASTs to detect dependencies between files.

Timezone: CET
Twitter: @munter

Ping @papandreou

jzaefferer commented 10 years ago

I'm interested in the effort, though I have no intention of participating in the design, since that's too far from my expertise. That said, a tool I've been contributing to, esformatter, makes use of rocambole, which builds on esprima. That seems to be exactly the use case you're talking about, so the format produced by rocambole should be a good reference. Maybe @millermedeiros, who wrote rocambole and esformatter, is interested in participating.

hegemonic commented 10 years ago

As the maintainer of JSDoc, I wish I could participate. However, I now work at Google, and our legal department gets nervous about assignment "to the public domain" without any other detail about what that means.

Please consider adopting the Creative Commons CC0 license (full text), which achieves the same purpose and provides more legal certainty for everyone involved. Creative Commons has written a helpful explanation of CC0's benefits.

I'm on Pacific time.

micmath commented 10 years ago

@getify where were you when we started JSDoc? :) Yes, yes, a thousand times yes! we need this. Meaning, "I am interested."

Primary interest: Improving documentation tools like JSDoc
Twitter: micmath

getify commented 10 years ago

@hegemonic you bring up a good point, but my intent to prevent there being any ownership/patentability of the ideas we discuss here seems to fall short of CC0, so I guess we'd have to adopt that plus something about an implicit CLA that gives up any rights over anything discussed here. I'd be interested to hear how Google would view something like that.

millermedeiros commented 10 years ago

When I decided to code esformatter I also realized that I needed the white space and comments. It was also very important to be able to retrieve which tokens are inside a "node" and an easy way to rebuild the program after the manipulation.

I also thought about adding something like extras to the AST nodes, but realized it would be too hard to rebuild the code, so I decided to create a linked list of all the tokens (including white space and comments) and use that instead. Converting the tokens into string is WAY EASIER than writing something like escodegen. (they are also easier to manipulate in many cases)
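A minimal sketch of the token linked list idea described above. This is not rocambole's actual API: the token shape (`type`, `value`, `prev`, `next`) and the helper names are assumptions made for illustration.

```javascript
// Link an array of tokens (including whitespace and comments) into a
// doubly linked list and return the head token.
function linkTokens(tokens) {
  let prev = null;
  for (const token of tokens) {
    token.prev = prev;
    token.next = null;
    if (prev) prev.next = token;
    prev = token;
  }
  return tokens[0] || null;
}

// Rebuilding the program is just concatenating token values in order.
function tokensToString(head) {
  let out = "";
  for (let t = head; t; t = t.next) out += t.value;
  return out;
}

// Insert a new token right after an existing one.
function insertAfter(token, newToken) {
  newToken.prev = token;
  newToken.next = token.next;
  if (token.next) token.next.prev = newToken;
  token.next = newToken;
}

const head = linkTokens([
  { type: "Keyword", value: "if" },
  { type: "Punctuator", value: "(" },
  { type: "Identifier", value: "ok" },
  { type: "Punctuator", value: ")" },
  { type: "WhiteSpace", value: " " },
  { type: "Identifier", value: "go" },
  { type: "Punctuator", value: "(" },
  { type: "Punctuator", value: ")" },
  { type: "BlockComment", value: "/* done */" },
]);

const original = tokensToString(head); // "if(ok) go()/* done */"

// A formatter-style edit: add a space after the `if` keyword.
insertAfter(head, { type: "WhiteSpace", value: " " });
const formatted = tokensToString(head); // "if (ok) go()/* done */"
```

Because whitespace and comments are ordinary tokens in the list, serializing needs no knowledge of the grammar, which is the contrast with a full code generator drawn above.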

Also important to note that ; and () can be ambiguous on the AST in many cases.

I can see benefits of a concrete syntax tree but I don't think that shoehorning an AST into CST will give good results. I think this problem requires a specific tree structure, probably Left-to-Right and using more Arrays than a regular AST. (structure sequence can be a mix of nodes and tokens..)

I'm going to give some thought about the problem and do some research as well. Maybe we can borrow from other languages. - But we should stick with the SpiderMonkey node names if possible.

I'm at GMT-3 (Brazil) but can stay awake till late if needed (or wake up early): m@millermedeiros.com, http://twitter.com/millermedeiros

hegemonic commented 10 years ago

@getify: Under US copyright law, I think it's difficult, if not impossible, to establish that nobody owns an idea. Also, ownership is central to CC0--you can't say "as the owner, I surrender my rights" unless you can also say "I am the owner."

If you don't feel that CC0 achieves all of your goals, you might consider adopting the same approach as SQLite, which Google's lawyers seem reasonably happy with.

Sorry to be a pain in the ass. I don't like US copyright law either. But I know this issue matters to my employer, and I suspect that it matters to some of the other companies represented in this thread.

getify commented 10 years ago

@hegemonic I appreciate very much your input. My position is to try to give a best effort, but as IANAL, I just want to operate as plainly and simply as we possibly can. This is an ad hoc group, and my goal is to make it easy for people to participate. We absolutely must get developers from the various browser engines to participate (or at least track!) what we discuss, because if they don't buy in to what we're suggesting, we've failed. So it's an explicit goal of mine to make it clear that this process is totally separate from any proprietary/confidential/patentable/copyrightable processes that such developers may also find themselves in.

To that end, I've tried to clarify the "license" section, and I hope it makes it clear that there's an implicit granting and licensing of all rights to, as you suggested, the CC0 public domain, when anyone submits publicly to our discussions. Without going down the route of having legal agreements, which I strictly will resist, I hope the wording is clear enough to keep browser developers "in the mix". But IANAL, so YMMV. :)

https://github.com/getify/concrete-syntax-tree/blob/master/README.md#license--assignment

jrootham commented 10 years ago

I am poking at a generalized CST system. https://github.com/jrootham/polyglotDEbootstrap Progress rate on that is an open question. I am in EST. twitter @jrootham

getify commented 10 years ago

@jrootham are you meaning "generalized" in the sense of being able to represent concrete syntax from any of various languages? I think we're clearly "scoped" here to only care about creating a tree syntax that suits JS syntax well.

jrootham commented 10 years ago

Hi Kyle

Yes, that is what I mean by generalized. That's my project. I don't expect it to take over this project, but I am interested in what happens here. If I can make a contribution, that's good too.

Jim

On Fri, Mar 21, 2014 at 9:56 PM, Kyle Simpson notifications@github.com wrote:

@jrootham https://github.com/jrootham are you meaning "generalized" in the sense of being able to represent concrete syntax from any of various languages? I think we're clearly "scoped" here to only care about creating a tree syntax that suits JS syntax well.


ariya commented 10 years ago

primary interest: I started and maintain Esprima
timezone: PST
twitter: ariyahidayat

getify commented 10 years ago

Everyone, please go read my proposed Charter/Mission in #4

edwardgerhold commented 10 years ago

I have a fun project, syntax.js, which uses Mozilla's Parser API and a few nodes for highlighting and evaluating ES6 code. My codegen doesn't output comments or whitespace either. Besides everything I want to do in syntax.js, I am also interested in an interoperable and up-to-date format.

ericelliott commented 10 years ago

I'm interested in applications for JSDoc static analysis, style enforcement, and code generation. PST.

hegemonic commented 10 years ago

@getify: I spoke with Google's legal team. Unfortunately, they can't agree to the additional language you've proposed beyond the CC0 license. In particular, they don't believe that there's a legal mechanism to waive patentability in most countries.

Sorry, but I'll have to withdraw my participation.

michaelficarra commented 10 years ago

@hegemonic: I suggest working for a company that is more open to this sort of thing being done in your personal time.

getify commented 10 years ago

@hegemonic Sorry to hear that. Google is well aware of the need for its employees to participate in groups where there is an explicit disclaimer/exclusion of any patentability. They have a bunch of employees that participate in TC39, W3C/WHATWG, etc, and those all explicitly require everyone including google employees to disclaim patent rights. Your legal team is either confused, being disingenuous (they know full well this is common), or are expecting you to only participate where there are lots of complex legal documents involved. That's a shame. But we're not going to go down that route for this group. I hope they'll reconsider. But that's between you and them. :/

hegemonic commented 10 years ago

@getify: To be clear, Google's lawyers are only taking issue with the specific language used for this project. Your other statements are not correct.

ericelliott commented 10 years ago

Perhaps the legal team could suggest changes to make the wording more like the other standards efforts that they're OK with.

getify commented 10 years ago

I know I signed an agreement when I started participation in W3C that required me to agree that I was not under some delusion that I would be able to patent what I submitted. Furthermore, every time a specification moves from draft into more official status, I'm always asked to exclude anything from the spec that I would want to be patentable.

IANAL, but I can assure you that I don't want a single word of what we produce to be anything that anyone at any patent-happy company thinks they can claim ownership over. This is a community effort in the public domain. The spirit of my "specific language" is to make that clear, and weed out any confusion before we get further down the road. Unfortunately, it seems like Google wants to reserve the right to patent things like this if they so wished, so it would appear a Google employee cannot comply with the spirit of our effort. That's a real shame. But it doesn't change the spirit of what we're going to do.

getify commented 10 years ago

Moreover, since our effort here is not under the protective umbrella of some official organization with lots of legal documents and lawyers protecting, we have to be even more careful that we not get shot in the foot by being unaware of things that lawyer-happy companies are aware of. I hope it's clear what the spirit of this effort is, and I hope as many people can participate as possible. It's regrettable that there will be cases like this where the efforts are incompatible with some companies.

hegemonic commented 10 years ago

@getify: Again, your statements about Google are not correct.

I won't be responding to any other comments in this thread. Please email me directly if you want to discuss this further.

getify commented 10 years ago

@hegemonic I haven't made any assertions about Google, only observations and "seems like...", mostly from your vague claims about what their legal department is telling you. I don't know how Google works. You're an employee there, you would know infinitely better than I would.

I have tried to make it clear that I would like for Google employees to participate in this process. I tried to adopt the CC0 as you originally suggested because I wanted to give my best effort to that.

I don't know how Google feels (other than inferences from what you've said), but I do know how I feel. So I'm making claims about that. And what I feel is, we're not going to operate the process under any pretense that what's being done here is going to be encumbered by patents. I've made that abundantly clear by now. I leave it to each person to decide how they (or their company) do or do not feel about that.

dberlin commented 10 years ago

@getify So, as the person who gave @hegemonic the legal advice, let me do my best to get this discussion back on the rails. I'm not sure where the assumptions you are using come from, but let me hopefully correct them.

  1. There are some little wording concerns. These are easy to fix with redlines.
  2. There are some more serious concerns.

In particular, the problem on our side is actually that we are perfectly fine with not having patents involved in this process (great, one less thing for me to worry about), but your language as is doesn't accomplish this.

You can't "yada yada" intellectual property rights, as the policy does for everything but copyright (since CC0 only covers copyright, and explicitly says it does not say anything about patent or trademark rights). It is not enough to say "you will not retain any IP rights". This kinda works for copyright, but for things like patents, it's not even clear what such a thing would mean, and a court isn't going to simply say "you participated, you can't patent stuff" (in fact, they've said the opposite).

You really need to be very very explicit what you want to happen for both patents and trademarks, or else you will get nothing.

If you want "you agree not to file patents on anything discussed or produced during discussions, and any patents you do accidentally file must be non-asserted", that is what you should say.

Google happily contributes to a large number of open source projects, has open source patent pledges, etc. We often non-assert or grant patents to open source projects. We participate in plenty of processes where we agree not to file patents. We actually prefer to release stuff under licenses like Apache, with an explicit grant. Heck, i've spent a large part of my career fighting for patent reform.

But we really do not want to be in the same situation as what happened to the companies participating in the Rambus JEDEC debacle (See http://en.wikipedia.org/wiki/Rambus#Lawsuits).

I do not believe complex legal documents are required here. We did not use complex legal language in the LLVM developer policy, for example. However, I am 100% positive your current wording does not accomplish what you want. If you'd like help with that, i'm happy to do what I can.

getify commented 10 years ago

@dberlin I am sorry if I was inferring incorrectly from the previous comments. I knew that Google does participate in lots of OSS and does do so in a very useful way for the community. It was confusing why there would be an issue with making that clear about this project. Glad to know I misunderstood. I am absolutely happy to have the clarifications, and I very much appreciate it.

[Edit: The subject of my confusion was it seemed to me that the objection was "too much" when it turns out the objection was "not enough"]

I think I signaled my willingness to adjust (maintaining the spirit) by my initial switch to CC0 per @hegemonic's suggestion, so I would happily entertain suggestions on how we can make the wording work for everyone.

If we can accomplish:

Then it's totally consistent with the spirit of the project, and I would be grateful for any assistance to get it that way. :)

If you would like to discuss this in more detail outside the context of this thread, feel free to email me on gmail with the same username as here.