Open LouisJenkinsCS opened 5 years ago
@LouisJenkinsCS @Lydia-duncan wouldn't that require the code to be compiled every time a type is declared? In my experience, the compiler takes a long time to compile the code. Also, it won't work for incomplete code like:
var x : int;
var y = x;
var z = (x + y) : real(32);
for i in
Think of the case where the user leaves the for i in at the bottom of the code incomplete like this, and goes one line up to declare a variable they forgot. The compiler will give a syntax error.
Still, if you think I should try it, I will.
My suggestion would be to throttle requests to re-compile to once every 5 seconds.
Also compilation is slow but you can use something like --stop-after-pass resolve
As in you could cache these results and only regenerate it once every so often.
As for incomplete code: if the compiler fails, prior to the "resolve" pass, don't discard previous results.
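A minimal sketch of the throttle-and-cache idea above, in Python for illustration only. The class name, the `compile_fn` callback (assumed to return a dict of variable types, or None when compilation fails before "resolve"), and the injectable clock are all hypothetical; nothing here is taken from an existing implementation.

```python
import time
from typing import Callable, Dict, Optional

# Hypothetical callback: takes source text, returns {variable: type} or None on failure.
CompileFn = Callable[[str], Optional[Dict[str, str]]]


class ThrottledTypeCache:
    """Re-run the compiler at most once per interval and keep the last good result."""

    def __init__(self, compile_fn: CompileFn, min_interval: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._compile_fn = compile_fn
        self._min_interval = min_interval
        self._clock = clock
        self._last_run: Optional[float] = None
        self._last_good: Optional[Dict[str, str]] = None

    def get(self, source: str) -> Optional[Dict[str, str]]:
        now = self._clock()
        # Throttle: inside the interval, answer from the cache without compiling.
        if self._last_run is not None and now - self._last_run < self._min_interval:
            return self._last_good
        self._last_run = now
        result = self._compile_fn(source)
        # On failure (e.g. incomplete code), do not discard the previous results.
        if result is not None:
            self._last_good = result
        return self._last_good
```

A hover handler could call `get()` on every request; only the first call in each 5-second window would actually invoke the compiler.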
Seen, getting myself up to speed on the implementation.
I don't think compilerWarning would gain you anything in this situation beyond just running the compiler - it gets evaluated at the same time as resolution runs, so it would be equivalent to:
var y: x.type = x;
I'm not completely clear on the approach. Is implementing type inference yourself the alternative to relying on the Chapel compiler for type information?
In terms of how to avoid certain lines causing failures that prevent information about other lines, you could take the approach of dropping lines that cause early failures (i.e. if it gives a syntax error, drop that line and recompile to see if you get further in compilation, with resolution being the furthest I would go when other errors are present).
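The drop-and-recompile loop could look roughly like the sketch below, written in Python for illustration. It assumes a hypothetical `first_syntax_error` callback that runs the compiler and returns the 1-based line number of the first syntax error (or None if there is none); how that number is extracted from the compiler's messages is not shown, and the compiler may well report a different line than the one that is actually incomplete.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical callback: 1-based line of the first syntax error, or None if clean.
CheckFn = Callable[[str], Optional[int]]


def drop_failing_lines(source: str, first_syntax_error: CheckFn,
                       max_drops: int = 20) -> Tuple[str, List[int]]:
    """Blank out lines that cause syntax errors until the source parses."""
    lines = source.split("\n")
    dropped: List[int] = []
    for _ in range(max_drops):
        bad = first_syntax_error("\n".join(lines))
        if bad is None:
            break  # no syntax error left; compilation can proceed to resolve
        if not 1 <= bad <= len(lines) or lines[bad - 1].strip() == "":
            break  # nothing sensible left to drop; give up instead of looping
        # Blank the line rather than delete it so later line numbers stay stable.
        lines[bad - 1] = ""
        dropped.append(bad)
    return "\n".join(lines), dropped
```

Blanking instead of deleting keeps line numbers in the compiler output aligned with the user's editor, which matters if the results are later mapped back to hover positions.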
I guess I misworded my question. I meant querying the compiler for the inferred type. Is there a way to get the compiler to dump all type information in a readable and predictable format?
The goal would be to allow the type of a variable to be shown on mouse-hover.
Ah, probably the --log compilation flag is what you're looking for (though it won't generate output for the particular pass that failed, iirc). You can ask --log to only handle specific passes (see compiler/main/runpasses.cpp for quick ways to check specific passes) or have it output for every pass. The file stored will have a lot of details about the AST, but you will likely be able to get any of the information you need from it, so long as it has been computed by that particular pass. I'm happy to give a run-down of what any of it means in a video call.
While looking through the --log output for the 'resolve' pass, I finally see type information that can possibly be parsed.
unknown x[185464]:int(64)[10] "insert auto destroy"
unknown y[185506]:int(64)[10] "insert auto destroy"
unknown call_tmp[546537]:int(64)[10] "expr temp" "maybe param" "temp"
unknown call_tmp[546542]:real(32)[110] "maybe param" "temp"
unknown z[185557]:real(32)[110] "insert auto destroy"
It looks like the above. I'm wondering if this can be used to query the type of the variables rather than explicit instrumentation.
All type information should be known at the end of resolve (assuming it succeeds). The [10] ids can also be used to link user defined types to their definitions (left as an exercise to the reader ;) ).
@lydia-duncan for the following code,
var s:real(32);
s = 40;
proc fname() {
var i: int;
var k = i;
writeln("Hello");
}
and the compilation command chpl test.chpl --log-pass r --stop-after-pass resolve, the test_13resolve.ast file shows:
AST dump for 3 after pass resolve.
{
function chpl__init_3[300152]() : void[4] "insert line file info" "module init" "resolved"
{
unknown call_tmp[522818]:real(32)[109] "expr temp" "maybe param" "temp" "type variable"
(388139 'move' s[178236](818193 call _defaultOf[818196]))
unknown coerce_tmp[818397]:real(32)[109] "coerce temp" "insert auto destroy" "temp"
(818403 'move' coerce_tmp[818397](818400 call _cast[818405] 40))
(178241 call =[315755] s[178236] coerce_tmp[818397])
(372643 return _void[43])
}
unknown s[178236]:real(32)[109]
function main[561783]() : void[4] "resolved"
{
(561786 return _void[43])
}
function chpl_gen_main[561789](arg _arg[561788]:chpl_main_argument[159399]) : int(64)[10] "compiler generated" "export" "generated main" "local args" "resolved"
{
val global_temp[912890]:domain(1,int(64),false)[714289] "temp"
val global_temp[909925]:string[29959] "temp"
val ret[561824]:int(64)[10] "RVV" "temp"
val _main_ret[561793]:int(64)[10] "temp"
unknown _endCount[561794]:unmanaged _EndCount(AtomicT(int(64)),int(64))[643598] "temp"
(561800 'move' _endCount[561794](561797 call _endCountAlloc[631389]))
(561802 'set dynamic end count' _endCount[561794])
(561804 call chpl_rt_preUserCodeHook[159570])
(561806 call chpl__init_3[300152])
(561808 call main[561783])
(561810 'move' _main_ret[561793] 0)
(561813 call chpl_rt_postUserCodeHook[159576])
unknown coerce_tmp[818975]:_EndCount(AtomicT(int(64)),int(64))[642983] "coerce temp" "insert auto destroy" "temp"
(818980 'move' coerce_tmp[818975](818978 'cast' _EndCount(AtomicT(int(64)),int(64))[642983] _endCount[561794]))
(561815 call _waitEndCount[818509] coerce_tmp[818975])
(561818 call chpl_deinitModules[159613])
(561829 'move' ret[561824] _main_ret[561793])
(561826 return ret[561824])
}
}
Apart from the type of s, I can't find anything else useful in it :disappointed:. Could you help me find out how to get the type of variables declared inside the proc fname()?
@lydia-duncan is there any specific way you would recommend for parsing these .ast files?
Grep for strings that match "%VARIABLE_NAME[\d+]:%VARIABLE_TYPE[\d+]", stick this in a symbol table, use that.
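As a rough illustration of the grep-into-a-symbol-table idea, here is a Python sketch that parses declaration lines of the shape quoted earlier in this thread. The regex, the function name, and the rule of skipping anything flagged "temp" are assumptions based only on the sample lines shown here; the dump format is not documented in this thread and may differ between compiler versions.

```python
import re
from typing import Dict, Tuple

# Matches lines such as:  unknown x[185464]:int(64)[10] "insert auto destroy"
# Only the "unknown" and "val" prefixes seen in the sample dumps are handled.
DECL = re.compile(
    r'^\s*(?:unknown|val)\s+(?P<name>\w+)\[(?P<id>\d+)\]:'
    r'(?P<type>.+?)\[(?P<type_id>\d+)\](?P<flags>(?:\s+"[^"]*")*)\s*$'
)


def parse_symbols(dump: str, include_temps: bool = False) -> Dict[str, Tuple[str, int]]:
    """Map variable name -> (type name, type id) from a resolve-pass AST dump."""
    table: Dict[str, Tuple[str, int]] = {}
    for line in dump.splitlines():
        m = DECL.match(line)
        if m is None:
            continue
        flags = re.findall(r'"([^"]*)"', m.group("flags"))
        # Heuristic: compiler temporaries carry a "temp" flag in the samples above.
        if not include_temps and "temp" in flags:
            continue
        table[m.group("name")] = (m.group("type"), int(m.group("type_id")))
    return table
```

Keying the table by name alone means two variables with the same name in different scopes overwrite each other; a real implementation would probably key on the numeric symbol id and track the enclosing function as well.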
Oh misread your question, it's about finding variables inside of functions. I'll see if I can obtain a solution real quick.
Found the issue: the function gets eliminated as it's never used (and likely never needs to be resolved). @lydia-duncan --no-dead-code-elimination doesn't seem to work either; it only gets resolved if I invoke fname() somewhere.
@AnubhavUjjawal I'd recommend that you move past this issue for now and try to obtain type information for the variables you can find.
Yup, if the function isn't called, it doesn't get resolved. Some of the compiler passes go through every symbol of a particular type (all classes, all functions, all modules, etc), while others only follow control flow through the program. Resolution only starts at the main function and follows that path, and then cleans up symbols that were not used. This allows us to avoid things like stamping out a copy of a generic function for every possible type we know about, but we've definitely had latent bugs in our code due to this.
I would recommend indicating to the user in some way if their code hasn't been resolved due to not being called. It is beyond the scope of this project to change the compiler so that those functions would get resolved as well, but we can detect by its absence that it hasn't been used and so could add a warning ("there may be errors here we can't see yet, please call the function" or something).
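Detecting "by its absence" could be sketched as below, again in Python for illustration. Both regexes are assumptions: the first only recognises plain `proc name(` declarations in the user's source (no methods, no paren-less procs), and the second relies on the `function name[id](` lines seen in the dump posted above. It is only meant to drive a warning, not to replace anything the compiler does.

```python
import re
from typing import List

# Simplistic patterns; assumed from the examples in this thread only.
PROC_DECL = re.compile(r'^\s*proc\s+(\w+)\s*\(', re.MULTILINE)
AST_FUNC = re.compile(r'^\s*function\s+(\w+)\[\d+\]\(', re.MULTILINE)


def unresolved_procs(source: str, dump: str) -> List[str]:
    """Return procs declared in the source that do not appear in the resolve dump."""
    declared = PROC_DECL.findall(source)
    resolved = set(AST_FUNC.findall(dump))
    return [name for name in declared if name not in resolved]


def warning_for(name: str) -> str:
    # Wording adapted from the suggestion above.
    return (f"'{name}' was not resolved because it is never called; "
            "there may be errors here we can't see yet.")
```

For the test.chpl example above, the dump contains chpl__init_3, main, and chpl_gen_main but no fname, so fname would be the one flagged.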
@LouisJenkinsCS @lydia-duncan I was just wondering if I could find the type directly. That is, if the user is hovering over a variable, instead of compiling the code, could I just use regular expressions to find the declaration?
My worry is that you would be reinventing the wheel, when we already have a solution. Any implementation you make that tries to recreate something the compiler already does without relying on the compiler itself has the potential for code drift and a maintenance burden.
For instance, when determining the type, what would you do if the user program had two type declarations with the same name? To solve that, you'd be reinventing scope resolution and reinventing the code to follow our use statements. What if our strategy for either of those changes (as it has recently due to rethinking our point of instantiation rules)? Your output will be incorrect, or you'll get a test failure and have to figure out how to adjust it in a similar fashion.
You could 'cheat it' a bit by appending function calls to every single function at the very end of the file. Then use --ignore-errors to handle cases where compilerError gets explicitly invoked. Worth a shot.
That'll work for functions with no arguments, but not for functions with arguments. What would you insert for the call when the argument is generic? Or when the argument is a complicated type? I think having a warning so the user can insert appropriate calls is the best strategy.
@lydia-duncan @LouisJenkinsCS using the compiler output, I am able to get this. There is still a lot to implement, which I would do during the coding period. Currently I am writing some hacky code as a proof of concept.
I am going to start writing a proposal on this, and will ask some conceptual questions on this same thread about whether things are implementable or not. I believe, as @lydia-duncan said,
The file stored will have a lot of details about the AST, but you will likely be able to get any of the information you need from it, so long as it has been computed by that particular pass.
I would be able to implement most of the functionality.
@LouisJenkinsCS @lydia-duncan If you would like me to do some more implementations as proofs of concept, please suggest some.
@AnubhavUjjawal Can you discard everything but the type? That'd be more than sufficient.
Yeah ok, did it. As I said, I didn't worry about the details shown in the hover but just made an implementation to make the type inference work.
I believe that you can possibly implement type inference by taking advantage of compilerWarning to obtain the type. For example, imagine you have something like this...How would you get this type information? Like this.
TIO
I suggest that the LSP could possibly instrument the original program by adding this to all variable declarations to query their type, which can then be parsed.
@lydia-duncan Any opinions on this approach? Should be a nice and decent short-term solution.