Open CyberShadow opened 3 years ago
IASM GCC Grammar:
GccAsmStatement:
    asm FunctionAttributes(opt) { GccAsmInstructionList }
GccAsmInstructionList:
    GccAsmInstruction ;
    GccAsmInstruction ; GccAsmInstructionList
GccAsmInstruction:
    GccBasicAsmInstruction
    GccExtAsmInstruction
    GccGotoAsmInstruction
GccBasicAsmInstruction:
    AssignExpression
GccExtAsmInstruction:
    AssignExpression : GccAsmOperands(opt)
    AssignExpression : GccAsmOperands(opt) : GccAsmOperands(opt)
    AssignExpression : GccAsmOperands(opt) : GccAsmOperands(opt) : GccAsmClobbers(opt)
GccGotoAsmInstruction:
    AssignExpression : : GccAsmOperands(opt) : GccAsmClobbers(opt) : GccAsmGotoLabels(opt)
GccAsmOperands:
    GccSymbolicName(opt) StringLiteral ( AssignExpression )
    GccSymbolicName(opt) StringLiteral ( AssignExpression ) , GccAsmOperands
GccSymbolicName:
    [ Identifier ]
GccAsmClobbers:
    StringLiteral
    StringLiteral , GccAsmClobbers
GccAsmGotoLabels:
    Identifier
    Identifier , GccAsmGotoLabels
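To make the productions concrete, here is a minimal sketch of what statements matching GccBasicAsmInstruction and GccExtAsmInstruction might look like. It assumes GDC targeting x86 or x86-64. The version identifier `GccX86Asm` and the function `maxCpuidLeaf` are hypothetical names introduced for illustration. GccSymbolicName and the goto form are not shown.

```d
// Assumption: only GDC on x86/x86-64 compiles the asm below.
// GccX86Asm is a hypothetical user-defined version identifier.
version (GNU)
{
    version (X86)    version = GccX86Asm;
    version (X86_64) version = GccX86Asm;
}

version (GccX86Asm)
{
    /// Returns EAX for CPUID leaf 0 (the highest supported basic leaf).
    uint maxCpuidLeaf()
    {
        uint a, b, c, d;

        // GccBasicAsmInstruction: a lone AssignExpression (the template).
        asm { "nop"; }

        // GccExtAsmInstruction: template : output operands : input operands.
        // Each operand is a constraint StringLiteral followed by a
        // parenthesised AssignExpression.
        asm
        {
            "cpuid"
                : "=a" (a), "=b" (b), "=c" (c), "=d" (d)
                : "a" (0), "c" (0);
        }
        return a;
    }
}
```

Other compilers never run semantic analysis on the guarded function, so the sketch only exercises the GDC grammar above.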
@ibuclaw Thanks!
So, that raises the question of where this should be kept / maintained:
GccAsmStatement as a possible option for AsmStatement, OSLT.
Any thoughts?
Probably yes on all three counts.
How is GCC-style assembly actually used in the wild? If I understand correctly, you can't just put it inside a version(GDC) block since code in there still has to parse, so I suppose people either:
Currently iasm.d looks like:
version (MARS)
{
import dmd.iasmdmd;
}
else version (IN_GCC)
{
import dmd.iasmgcc;
}
Can't both compilers at least support parsing both, and have GccAsmStatement in their grammar? Or is there ambiguity?
@ibuclaw https://github.com/libmir/mir-cpuid/blob/4add5b639351f15044a8c991ce3a905d7973c3de/source/cpuid/x86_any.d#L606-L609 still doesn't parse with the grammar you posted, is that expected?
How is GCC-style assembly actually used in the wild? If I understand correctly, you can't just put it inside a version(GDC) block since code in there still has to parse, so I suppose people either:
Nope, you put it in a version condition. The parser just collects all the tokens within the braces, and the semantic pass then parses the contents.
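A minimal sketch of that pattern, assuming an x86 or x86-64 target when built with GDC. GDC's predefined version identifier is `GNU`. The identifier `GuardedGccAsm` and the function `add` are hypothetical names chosen for this example.

```d
// GuardedGccAsm is a hypothetical user-defined version identifier,
// set only for GDC on x86 or x86-64.
version (GNU)
{
    version (X86)    version = GuardedGccAsm;
    version (X86_64) version = GuardedGccAsm;
}

/// Adds two integers, using GCC-style extended asm where it is available.
uint add(uint a, uint b)
{
    version (GuardedGccAsm)
    {
        uint sum;
        // AT&T syntax: addl src, dest. Operand 0 is the output and is
        // tied to `a` through the matching constraint "0".
        asm { "addl %2, %0" : "=r" (sum) : "0" (a), "r" (b) : "cc"; }
        return sum;
    }
    else
    {
        // Portable fallback for every other compiler or target.
        return a + b;
    }
}
```

Because the asm body is only tokenised at parse time, a compiler that takes the `else` branch should never need to understand the GCC-style operands.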
@ibuclaw https://github.com/libmir/mir-cpuid/blob/4add5b639351f15044a8c991ce3a905d7973c3de/source/cpuid/x86_any.d#L606-L609 still doesn't parse with the grammar you posted, is that expected?
That's deprecated syntax, and will be an error soon™
Can't both compilers at least support parsing both, and have GccAsmStatement in their grammar? Or is there ambiguity?
Looks like LDC does a lazy check on the first token for the easy cases where it cannot be a dmd-style iasm statement. This would mean that there are some styles of asm that LDC does not support.
asm { ctfeFunctionReturningInstructionsAsString(); }
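The following is a rough sketch of such a first-token heuristic, not LDC's actual code. It assumes the check amounts to "a leading string literal means GCC style, anything else means dmd style". `AsmStyle` and `guessAsmStyle` are hypothetical names, and only `"` and backtick literals are recognised here.

```d
import std.string : stripLeft;

enum AsmStyle { gcc, dmd }

/// Hypothetical helper: guesses the asm style from the first character
/// of the text between the braces of an asm statement.
AsmStyle guessAsmStyle(string asmBody)
{
    const trimmed = asmBody.stripLeft;
    // A dmd-style instruction cannot begin with a string literal,
    // so a leading quote is taken to mean GCC style.
    if (trimmed.length > 0 && (trimmed[0] == '"' || trimmed[0] == '`'))
        return AsmStyle.gcc;
    // Everything else, including a call that returns the instruction
    // string at compile time, falls through to dmd style.
    return AsmStyle.dmd;
}
```

Under this assumption the body `ctfeFunctionReturningInstructionsAsString();` is classified as dmd style, which is the unsupported case described above.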
There are a number of ambiguities between a dmd-style AsmExp and the AssignExpression instruction string of the gdc style, e.g.:
enum cpuid = "cpuid";
asm { cpuid; } // accepted dmd-style asm
These could perhaps be overcome by tweaking the grammar to force instruction expressions (not string literals) to be enclosed in parentheses in the gdc style.
A small change of plans: https://github.com/CyberShadow/tree-sitter-d/commit/ccfcf13144a4a6dfe83d19b17cb1100e24d8ad73
I would encourage looking at https://github.com/gdamore/tree-sitter-d
I believe it parses pretty much everything correctly.
I've tested it with the DMD test suite, with DMD itself, with DUB, and a large body of non-public work.
Hi @gdamore,
First, thanks for taking the initiative in putting together a complete tree-sitter package for D!
That said, please forgive me if I'm a little skeptical about your claims. Looking at the list in this issue's description, many of the errors produced by the implementation in this repository are not due to defects in the implementation, but genuinely because the source files contain invalid grammar. If you're claiming that they parse successfully, that would indicate that your parser is overly relaxed, and will consume invalid grammatical constructs and present them as valid D programs.
I'm also puzzled by the route you've taken for your project. As I understand, your tree-sitter grammar is hand-written. Was there any reason why you didn't just take the output from this project and add the missing pieces, such as highlighting queries, to produce a complete package usable in editors?
The main distinction between this project and your endeavor is the scope. This project aims to:
The last point is an important one. Tools will go out of date as soon as someone stops actively looking at them. For evidence of how much this affects our community, here is an example: https://forum.dlang.org/post/vfkwhtigkvkmhtzalwaf@forum.dlang.org
As such, I really think your efforts would have much more impact if you contributed to the effort started here. Unfortunately, due to recent circumstances, I've been unable to dedicate as much time to my open source projects as I would have liked, but I remain available for questions and support.
With #2 closed, we now successfully parse all files in DMD's compilable test suite. However, there is still lots of valid code out there which tree-sitter-d fails to parse successfully.
The following is a list of errors which currently result from attempting to parse D source files from the list of D projects used by the community project tester.
Many of these are simply due to implementation bugs in our parser, but some of these may be due to underspecified grammar. Furthermore, since the problem was not detected testing against the DMD compilable test suite, each occurrence represents an opportunity to improve the coverage of the DMD suite by adding a reduced sample of the unparseable code to it.
For each of the problems below, we should:
ERROR [114, 4] - [114, 34]
ERROR [13, 0] - [239, 1]
ERROR [14, 0] - [1140, 5]
ERROR [60, 4] - [60, 5]
ERROR [81, 43] - [81, 44]
ERROR [15, 0] - [743, 5]
MISSING identifier [45, 29] - [45, 29]
ERROR [29, 4] - [29, 5]
ERROR [270, 0] - [271, 11]
ERROR [58, 2] - [60, 3]
ERROR [188, 0] - [192, 1]
ERROR [11, 10] - [11, 28]
ERROR [17, 10] - [17, 38]
MISSING identifier [29, 22] - [29, 22]
MISSING identifier [11, 17] - [11, 17]
ERROR [9, 0] - [24, 1]
ERROR [48, 0] - [48, 1]
ERROR [0, 0] - [19, 34]
ERROR [13, 4] - [13, 47]
ERROR [7, 4] - [11, 13]
ERROR [15, 22] - [15, 31]
ERROR [7, 13] - [7, 22]
ERROR [10, 4] - [28, 65]
ERROR [0, 0] - [220, 0]
ERROR [149, 92] - [149, 93]
ERROR [69, 39] - [69, 40]
ERROR [177, 38] - [177, 39]
ERROR [0, 0] - [529, 10]
ERROR [910, 4] - [910, 31]
MISSING "." [3218, 109] - [3218, 109]
MISSING "." [8903, 73] - [8903, 73]
ERROR [928, 31] - [928, 43]
MISSING identifier [2582, 29] - [2582, 29]
MISSING identifier [128, 17] - [128, 17]
ERROR [33, 28] - [33, 35]
ERROR [235, 32] - [235, 34]
MISSING identifier [1366, 25] - [1366, 25]
ERROR [1308, 23] - [1308, 28]
MISSING identifier [50, 33] - [50, 33]
ERROR [95, 34] - [95, 36]
ERROR [117, 4] - [117, 37]
ERROR [4846, 4] - [4846, 26]
MISSING ")" [4813, 89] - [4813, 89]
MISSING "." [39, 42] - [39, 42]
ERROR [9, 0] - [800, 1]
ERROR [385, 23] - [385, 24]
ERROR [624, 65] - [624, 66]
ERROR [37, 86] - [37, 87]
ERROR [274, 6] - [274, 7]
ERROR [1716, 39] - [1716, 40]
ERROR [17, 0] - [114, 1]
MISSING ";" [0, 10] - [0, 10]
ERROR [2, 0] - [2, 14]
ERROR [4, 1] - [4, 6]
ERROR [121, 21] - [121, 22]
MISSING identifier [706, 20] - [706, 20]
MISSING identifier [69, 36] - [69, 36]
MISSING identifier [108, 29] - [108, 29]
MISSING ";" [873, 8] - [873, 8]
ERROR [26, 4] - [26, 15]
ERROR [15, 4] - [17, 35]
ERROR [22, 4] - [22, 35]
ERROR [31, 4] - [35, 36]
ERROR [43, 12] - [43, 48]
MISSING "." [897, 29] - [897, 29]
ERROR [10, 0] - [10, 18]
ERROR [16, 0] - [16, 11]
ERROR [18, 0] - [18, 32]
ERROR [348, 18] - [348, 19]
ERROR [22, 0] - [48, 1]
ERROR [13, 0] - [23, 1]
ERROR [537, 17] - [537, 19]
ERROR [20, 0] - [20, 11]
ERROR [16, 0] - [16, 11]
ERROR [208, 33] - [208, 34]
ERROR [78, 4] - [89, 91]
ERROR [74, 0] - [74, 31]
MISSING identifier [89, 16] - [89, 16]
ERROR [19, 8] - [20, 39]
ERROR [16, 0] - [16, 11]
MISSING identifier [393, 44] - [393, 44]
MISSING identifier [393, 33] - [393, 33]
ERROR [179, 4] - [182, 34]
ERROR [136, 23] - [136, 24]
MISSING identifier [906, 22] - [906, 22]
ERROR [130, 4] - [135, 28]
ERROR [69, 0] - [436, 1]
ERROR [359, 30] - [359, 31]
MISSING "." [323, 52] - [323, 52]
ERROR [21, 0] - [114, 1]
ERROR [494, 4] - [494, 5]
ERROR [28, 0] - [124, 5]
ERROR [28, 0] - [161, 5]
ERROR [204, 4] - [204, 5]
ERROR [269, 4] - [269, 5]
ERROR [611, 8] - [611, 9]
ERROR [18, 0] - [424, 5]
ERROR [654, 8] - [654, 9]
ERROR [220, 8] - [220, 9]
ERROR [149, 8] - [149, 9]
ERROR [139, 4] - [139, 5]
ERROR [13, 0] - [271, 5]
ERROR [19, 0] - [581, 1]
ERROR [30, 0] - [270, 5]
ERROR [35, 0] - [851, 5]
ERROR [61, 4] - [61, 5]
ERROR [20, 0] - [600, 5]
ERROR [19, 0] - [414, 5]
ERROR [33, 0] - [451, 5]
ERROR [49, 0] - [300, 1]
ERROR [22, 0] - [221, 5]
ERROR [666, 15] - [666, 20]
ERROR [831, 41] - [831, 46]
ERROR [47, 0] - [1964, 1]
ERROR [260, 55] - [260, 60]
MISSING identifier [548, 32] - [548, 32]
ERROR [341, 14] - [343, 10]
ERROR [741, 8] - [759, 23]
ERROR [9, 0] - [670, 2]
ERROR [52, 2] - [52, 3]
ERROR [116, 2] - [116, 3]
ERROR [56, 2] - [56, 3]
ERROR [106, 2] - [106, 3]
ERROR [172, 2] - [172, 3]
ERROR [132, 2] - [132, 3]
ERROR [117, 2] - [117, 3]
ERROR [110, 2] - [110, 3]
ERROR [51, 2] - [51, 3]
ERROR [223, 2] - [223, 3]
ERROR [72, 2] - [72, 3]
ERROR [320, 2] - [320, 3]
ERROR [63, 2] - [63, 3]
ERROR [143, 2] - [143, 3]
ERROR [98, 2] - [98, 3]
ERROR [111, 2] - [111, 3]
ERROR [124, 1] - [124, 2]
ERROR [275, 1] - [275, 2]
ERROR [72, 2] - [72, 3]
ERROR [183, 2] - [183, 3]
ERROR [99, 2] - [99, 3]
ERROR [63, 2] - [63, 3]
ERROR [69, 2] - [69, 3]
ERROR [180, 2] - [180, 3]
ERROR [170, 2] - [170, 3]
ERROR [127, 2] - [127, 3]
ERROR [482, 2] - [482, 3]
ERROR [180, 2] - [180, 3]
ERROR [202, 2] - [202, 3]
ERROR [93, 2] - [93, 3]
ERROR [73, 2] - [73, 3]
ERROR [183, 2] - [183, 3]
ERROR [95, 2] - [95, 3]
ERROR [7, 0] - [297, 1]
ERROR [211, 2] - [211, 3]
ERROR [117, 2] - [117, 3]
ERROR [109, 5] - [109, 6]
ERROR [119, 2] - [119, 3]
ERROR [123, 1] - [123, 2]
ERROR [314, 5] - [314, 6]
ERROR [175, 6] - [175, 7]
ERROR [20, 0] - [418, 1]
ERROR [20, 0] - [693, 1]
ERROR [0, 0] - [0, 11]
ERROR [0, 27] - [0, 53]
ERROR [0, 0] - [0, 13]
ERROR [4, 0] - [22, 1]
ERROR [2, 0] - [13, 1]
ERROR [5, 12] - [5, 16]
ERROR [7, 0] - [11, 1]
ERROR [4, 0] - [7, 1]
ERROR [2, 0] - [10, 1]
ERROR [4, 0] - [12, 1]
ERROR [4, 1] - [4, 6]
ERROR [2, 0] - [6, 1]
ERROR [2, 0] - [6, 1]
ERROR [0, 0] - [10, 1]
ERROR [2, 0] - [2, 4]
ERROR [2, 0] - [10, 1]
ERROR [15, 0] - [23, 1]
ERROR [14, 0] - [22, 1]
ERROR [0, 0] - [0, 1]
ERROR [0, 0] - [0, 16]
ERROR [5, 0] - [15, 1]
ERROR [6, 0] - [10, 1]
ERROR [2, 0] - [6, 1]
ERROR [5, 0] - [9, 1]
ERROR [15, 1] - [15, 3]
ERROR [4, 1] - [4, 2]
ERROR [0, 0] - [0, 10]
ERROR [0, 0] - [0, 3]
ERROR [2, 14] - [2, 21]
ERROR [0, 0] - [0, 17]
ERROR [1, 30] - [1, 68]
ERROR [1, 37] - [1, 78]
ERROR [1, 37] - [1, 78]
ERROR [0, 0] - [3, 1]
ERROR [4, 0] - [10, 1]
ERROR [3, 1] - [3, 7]
ERROR [3, 1] - [3, 3]
ERROR [11, 1] - [11, 7]
ERROR [5, 0] - [8, 1]
ERROR [2, 0] - [5, 1]
ERROR [0, 23] - [0, 29]
ERROR [0, 19] - [0, 23]
ERROR [0, 0] - [1, 0]
ERROR [0, 24] - [0, 26]
ERROR [0, 7] - [0, 37]
ERROR [0, 15] - [0, 17]
ERROR [0, 0] - [0, 48]
ERROR [6, 12] - [6, 16]
ERROR [2, 1] - [2, 5]
ERROR [0, 0] - [0, 8]
ERROR [0, 21] - [0, 25]
MISSING ";" [7, 1] - [7, 1]
ERROR [0, 0] - [1, 0]
ERROR [0, 76] - [0, 77]
ERROR [0, 0] - [0, 56]
ERROR [0, 63] - [0, 70]
MISSING identifier [3, 20] - [3, 20]
ERROR [0, 0] - [0, 23]
ERROR [0, 11] - [0, 16]
ERROR [0, 0] - [0, 14]
ERROR [0, 23] - [0, 25]
ERROR [0, 38] - [0, 42]
ERROR [0, 54] - [0, 64]
ERROR [0, 50] - [0, 83]
ERROR [0, 0] - [0, 13]
ERROR [5, 0] - [12, 1]
ERROR [7, 1] - [7, 3]
ERROR [0, 10] - [0, 48]
ERROR [0, 0] - [0, 41]
ERROR [0, 0] - [0, 17]
ERROR [0, 18] - [0, 60]
ERROR [0, 21] - [0, 37]
ERROR [0, 0] - [0, 2]
ERROR [0, 57] - [0, 58]
ERROR [10, 0] - [14, 1]
ERROR [15, 0] - [19, 1]
ERROR [0, 12] - [0, 13]
ERROR [0, 31] - [0, 34]
ERROR [0, 12] - [0, 13]
ERROR [0, 0] - [1, 0]
ERROR [0, 17] - [0, 19]
ERROR [0, 0] - [0, 9]
ERROR [2172, 32] - [2172, 53]
ERROR [0, 9] - [0, 16]
ERROR [4, 17] - [4, 18]
ERROR [0, 19] - [0, 41]
ERROR [2, 8] - [2, 9]
ERROR [0, 0] - [0, 10]
ERROR [2, 2] - [2, 3]
ERROR [0, 0] - [0, 10]
MISSING ")" [2, 57] - [2, 57]
MISSING identifier [3, 0] - [3, 0]
ERROR [0, 21] - [0, 23]
ERROR [5, 0] - [14, 5]
ERROR [2, 8] - [2, 18]
ERROR [0, 0] - [0, 8]
ERROR [0, 0] - [0, 11]
ERROR [0, 0] - [1, 0]
ERROR [0, 0] - [0, 6]
ERROR [0, 0] - [0, 10]
ERROR [0, 0] - [0, 5]
ERROR [0, 0] - [1, 0]
ERROR [0, 0] - [0, 8]
MISSING "abstract" [0, 0] - [0, 0]
ERROR [0, 0] - [0, 50]
ERROR [0, 10] - [0, 11]
ERROR [0, 24] - [0, 25]
ERROR [0, 0] - [0, 56]
ERROR [6, 8] - [6, 14]
ERROR [2, 4] - [2, 5]
ERROR [0, 0] - [1, 5]
ERROR [0, 0] - [26, 0]
ERROR [0, 0] - [9, 0]
MISSING identifier [2, 10] - [2, 10]
ERROR [2, 4] - [2, 9]
ERROR [0, 24] - [0, 25]
ERROR [0, 11] - [0, 32]
ERROR [1, 0] - [1, 19]
ERROR [21, 57] - [21, 58]
MISSING identifier [15, 11] - [15, 11]
ERROR [0, 0] - [8, 1]
ERROR [8, 11] - [8, 17]
ERROR [0, 0] - [24, 1]
ERROR [48, 24] - [48, 25]
ERROR [2, 4] - [23, 20]
ERROR [0, 0] - [26, 0]
ERROR [1, 25] - [1, 27]
ERROR [6, 15] - [6, 21]
ERROR [68, 11] - [68, 20]
ERROR [4, 6] - [4, 7]
ERROR [0, 0] - [9, 31]
ERROR [10, 8] - [10, 9]
ERROR [0, 0] - [3, 23]
ERROR [0, 40] - [0, 52]
ERROR [11, 0] - [11, 1]
MISSING identifier [2, 0] - [2, 0]
ERROR [6, 1] - [6, 9]
ERROR [6, 0] - [7, 1]
ERROR [0, 0] - [10, 0]
ERROR [10, 0] - [10, 4]
MISSING identifier [5, 9] - [5, 9]
ERROR [0, 0] - [21, 0]
ERROR [1, 0] - [1, 4]
ERROR [0, 0] - [15, 0]
ERROR [13, 0] - [13, 4]
ERROR [153, 27] - [153, 28]
ERROR [8, 27] - [8, 28]
MISSING "." [111, 59] - [111, 59]
MISSING "." [343, 54] - [343, 54]
MISSING "." [107, 32] - [107, 32]
ERROR [8, 8] - [8, 16]
ERROR [10, 8] - [10, 16]
ERROR [10, 8] - [10, 14]
MISSING "." [315, 22] - [315, 22]
ERROR [27, 0] - [533, 1]
MISSING "." [344, 54] - [344, 54]
MISSING "." [107, 32] - [107, 32]
ERROR [6, 8] - [6, 16]
ERROR [8, 8] - [8, 16]
ERROR [8, 8] - [8, 14]
MISSING "." [313, 22] - [313, 22]
ERROR [188, 8] - [188, 25]
ERROR [104, 0] - [104, 4]
ERROR [345, 16] - [345, 21]
ERROR [179, 83] - [179, 84]
MISSING "." [179, 30] - [179, 30]
MISSING "." [583, 36] - [583, 36]
ERROR [245, 5] - [245, 6]
ERROR [175, 4] - [175, 5]
ERROR [434, 4] - [434, 5]
ERROR [66, 1] - [66, 33]
ERROR [837, 18] - [837, 21]
ERROR [6, 0] - [8, 1]
ERROR [4, 0] - [191, 29]
ERROR [2, 0] - [90, 1]
ERROR [2, 0] - [86, 9]
ERROR [0, 0] - [88, 10]
ERROR [56, 4] - [56, 14]
ERROR [2, 0] - [406, 1]
ERROR [2, 0] - [76, 1]
ERROR [2, 0] - [218, 1]
ERROR [87, 0] - [832, 1]
ERROR [2172, 9] - [2173, 15]
ERROR [2, 0] - [45, 63]
ERROR [54, 0] - [1067, 1]
ERROR [485, 39] - [485, 41]
MISSING ";" [91, 5] - [91, 5]
MISSING ";" [272, 5] - [272, 5]
ERROR [98, 50] - [98, 61]
MISSING "." [43, 21] - [43, 21]
MISSING "." [75, 33] - [75, 33]
ERROR [0, 0] - [36, 0]
ERROR [29, 0] - [105, 1]
ERROR [7, 0] - [42, 1]
ERROR [7, 0] - [158, 1]
ERROR [42, 0] - [42, 1]
ERROR [8, 0] - [134, 1]
ERROR [2, 0] - [78, 1]
ERROR [0, 0] - [41, 0]
ERROR [605, 12] - [608, 19]
ERROR [0, 0] - [27, 0]
MISSING identifier [549, 20] - [549, 20]
ERROR [139, 29] - [139, 38]
MISSING identifier [1989, 28] - [1989, 28]
ERROR [0, 0] - [32, 0]
ERROR [7, 0] - [310, 1]
ERROR [99, 0] - [294, 1]
ERROR [36, 0] - [36, 1]
ERROR [244, 5] - [244, 6]
ERROR [105, 12] - [105, 24]
ERROR [49, 8] - [49, 20]
ERROR [35, 0] - [678, 1]
ERROR [10, 0] - [372, 1]
ERROR [10, 0] - [432, 1]
ERROR [91, 10] - [91, 22]
ERROR [148, 8] - [148, 20]
ERROR [112, 8] - [112, 20]
ERROR [100, 8] - [100, 20]
ERROR [117, 8] - [117, 20]
ERROR [100, 8] - [100, 20]
ERROR [100, 10] - [100, 22]
ERROR [101, 10] - [101, 22]
ERROR [100, 10] - [100, 22]
ERROR [101, 10] - [101, 22]
ERROR [143, 10] - [143, 22]
ERROR [48, 0] - [48, 1]
ERROR [731, 0] - [731, 3]
ERROR [818, 37] - [818, 38]
ERROR [52, 0] - [52, 2]
ERROR [37, 0] - [1040, 1]
ERROR [17, 0] - [793, 1]