Closed atisharma closed 6 months ago
Perplexing. It looks like Python wants to try to reparse the source file (as Python) even though all the docstrings it wants should already exist at runtime. I'm not sure Hy can fix this, but I'll look into it.
It seems to occur whenever a Hy source file contains a defclass
with no docstring.
There it is:
If there is no docstring, pydoc [
pydoc.help
being called byhelp
] tries to obtain a description from the block of comment lines just above the definition of the class, function or method in the source file, or at the top of the module (seeinspect.getcomments()
).
inspect.getcomments
says "If the object's source code is unavailable, return None
." So, short of monkey-patching inspect.getcomments
to actually get Hy comments, which is a tall order, perhaps we can at least get it to return None
when the source file is in Hy.
Thank you for looking at that. Unfortunately, it breaks a use case where I use inspect.getsourcefile
for Hy objects. This allows me to print the source of a function or other object in the repl, which I find very helpful.
Since you are monkey-patching inspect anyway, could I suggest that monkey-patching inspect.getcomments
might be the better solution? I think it would not be too difficult to write a regex to extract comments, and improving the functionality of the inspect module with Hy would have other benefits - e.g. #1696 #1680 #1397.
If you're agreeable to that approach, I'll work on a patch. I'm aware you would want a test.
I think it would not be too difficult to write a regex to extract comments
I'm not so sure. Hy's syntax is Turing-undecidable (because of reader macros); it has complex nested structures (such as bracket f-strings) that are difficult to match properly with regexes; and macros can produce code with bogus position information. But the proof is in the pudding.
I think I have sorted this out by patching inspect. This would also solve #1696. See https://github.com/atisharma/hy/blob/inspect-compat/hy/_compat.py which fixes:
If this approach is acceptable then I'll finish writing tests and proceed. If you want changes (e.g. it's quite long - would you prefer hy._inspect.py?) or if I'm barking up the wrong tree, then of course let me know.
I think calling hy_compile
or read_many
is too much, because it would mean that help
can execute arbitrary code, which is a big change from the situation in plain Python.
I haven't tested your hy_getcomments
, but it looks like it could be fooled with plain string literals, bracket strings, reader macros, etc. It may well still be better than nothing; the thing to do is probably just to ensure it will return the wrong answer or None
instead of raising an error.
I think calling
hy_compile
orread_many
is too much, because it would mean thathelp
can execute arbitrary code, which is a big change from the situation in plain Python.
I see your point, but doesn't calling help on an object mean the object has already been through the reader once, since it's in the namespace? There is no way to locate a class without either following the syntax tree (which is what python's inspect._ClassFinder does) or guessing with regexes, which has its own problems, as you pointed out. I don't see any other way.
I haven't tested your
hy_getcomments
, but it looks like it could be fooled with plain string literals, bracket strings, reader macros, etc. It may well still be better than nothing; the thing to do is probably just to ensure it will return the wrong answer orNone
instead of raising an error.
This is expected behaviour. Python's inspect.getcomment only shows comments with a hash # like this
before the class or function definition, and does not show strings. I would expect hy_getcomments
to also ignore string literals, bracket strings in the same way, and reader macros too, since they are not comments. And as you say, it's better than returning None
.
Since all the functions rely on findsource
; getcomment
will not work without it.
I do appreciate that patching inspect is not to be taken lightly.
doesn't calling help on an object mean the object has already been through the reader once, since it's in the namespace?
Sure, but one of the things about arbitrary code is that running it can have a different effect the second time. Besides, there's no guarantee that the file hasn't changed, so you might execute different code, after all.
I don't see any other way.
You could change the compiler to cache what you need when the code is originally compiled, perhaps. But simply not implementing this is probably reasonable.
I would expect
hy_getcomments
to also ignore string literals, bracket strings in the same way, and reader macros too, since they are not comments.
When I say "fooled with", I mean that those things could be mistaken for comments by your text-processing code, not that the code ought to treat them as comments and fails to. I could construct an example if this isn't clear.
Well, the user has already made the decision to trust the code by that point, but I can see why you're not keen.
I can implement caching the python ast for use in findsource
but modifying the compiler seems rather brutal. I'll have a think about possible other approaches, maybe like a HyReader
without reader macros.
In the meantime, your patch is lighter touch and fixes the help
problem. I can get around my getsource
problem just in that module.
I also have another PR you might like better, which patches pydoc.getdoc
rather than anything in inspect
. To be uploaded shortly.
I came across a strange error causing problems viewing the docstring of a module. Below is my setup and a minimum working example.
results in the following error:
Commenting out the class definition removes the error. Removing the docstring preserves the error.