Open uprokevin opened 1 year ago
Code to update is here
We need to fetch the existing docstring and add some basic logic to merge it with the new one: new docstring <> existing one (string comparison).
Hi! Thanks for the suggestion :)
My idea was to provide two options:
But what you are saying seems reasonable to me. What I think I'll do is to create another configuration:
Yes, having different updating mode is good:
update_mode = None ## only new docstring
update_mode = "overwrite" ## Dangerous: overwrites all docstrings...
update_mode = "append" ## Append the old docstring at the bottom of the newly generated one.
update_mode = "merge" ## Merge new and old in a smart way...
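To make the four modes concrete, here is a minimal sketch of how they could be dispatched. This is not code from the library: `combine_docstrings` is a hypothetical helper, reading `None` as "leave existing docstrings untouched" is an assumption, and the "merge" branch is only a naive line-level de-duplication standing in for a smarter merge.

```python
from typing import Optional


def combine_docstrings(new: str, old: Optional[str], update_mode: Optional[str] = None) -> str:
    """Hypothetical helper: pick the docstring text to write for one function."""
    new = new.strip()
    old = (old or "").strip()
    if not old:
        # No existing docstring, so there is nothing to preserve.
        return new
    if update_mode is None:
        # Assumed meaning of None: only undocumented functions get a new docstring.
        return old
    if update_mode == "overwrite":
        # Dangerous: the existing docstring is discarded.
        return new
    if update_mode == "append":
        # Generated docstring first, old docstring kept underneath.
        return f"{new}\n\n{old}"
    if update_mode == "merge":
        # Naive stand-in for a smart merge: keep old lines that the new text lacks.
        seen = {line.strip() for line in new.splitlines()}
        extra = [line for line in old.splitlines() if line.strip() and line.strip() not in seen]
        return new if not extra else new + "\n\n" + "\n".join(extra)
    raise ValueError(f"unknown update_mode: {update_mode!r}")
```

With this shape, "append" never loses information, at the cost of possible repetition that the user cleans up afterwards.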
Think of the case of adding docstrings to an existing codebase.
@MichaelisTrofficus: any thoughts on implementing this?
Hello! Sorry for the delay, I've been working a lot on other projects during the summer and I couldn't put any time into this project. But I'm back, so I'll be implementing these features and also additional ones (translating between docstring styles, e.g. from numpydoc into google).
Thanks! A lot of Python code already has some docstrings, so merging with the existing ones will help avoid losing information.
Hi @arita37! I don't know if you've been keeping track of the new versions of the library, but there have been a lot, haha. Right now, you have two options when generating docstrings.
The first option is to update the file in place (overwriting the file basically). This option will only generate docstrings when no docstrings are found for the class / function in hand.
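As an illustration of the "only when no docstring is found" rule, here is a minimal sketch that lists the functions and classes lacking a docstring, using only the standard `ast` module. The helper name `names_missing_docstrings` is hypothetical and this is not necessarily how the library does it.

```python
import ast
from typing import List


def names_missing_docstrings(source: str) -> List[str]:
    """Return the names of functions and classes in `source` that have no docstring."""
    documentable = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    missing = []
    for node in ast.walk(ast.parse(source)):
        # ast.get_docstring returns None when the body does not start with a string literal.
        if isinstance(node, documentable) and ast.get_docstring(node) is None:
            missing.append(node.name)
    return missing
```

Anything not returned here already has a docstring and would be skipped by the in-place option.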
The second option does not overwrite the file, but it creates a git patch, with all the proposed docstrings. I have added detailed documentation in the README.md if you want to take a look at it.
My idea is that, by having this patch file, the developer will be able to decide which changes he wants to add to the original file.
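For readers unfamiliar with patch files, the sketch below shows the general idea with the standard library's `difflib.unified_diff`: the original file is left alone and the proposed docstrings are emitted as a unified diff. `make_patch` and the `example.py` path are hypothetical, and the library's actual patch generation may work differently.

```python
import difflib


def make_patch(original: str, updated: str, path: str = "example.py") -> str:
    """Return a unified diff from `original` to `updated`, or "" if they are identical."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


original = "def f():\n    return 1\n"
updated = 'def f():\n    """Return one."""\n    return 1\n'
print(make_patch(original, updated))
```

The developer can then review the diff and apply only the hunks they want.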
(Btw, there's also another experimental feature, which lets you translate between docstring styles, e.g. from numpy docstrings into google docstrings; I've also added documentation about it in the Example section of the README.md)
Let me know what you think 👍
Hello,
Thanks for it.
For my part, I will not use it, since it increases my workload… the goal of the tool is to automate as many manual steps as possible.
In reality, what happens is:
The developer already has a codebase with incomplete docstrings (or some arguments are missing).
Better to directly append the GPT docstring to the existing one.
I think appending should not be too difficult…(?)
But then you'll also have to reformat the docstring, right? I mean, if you have an incomplete docstring and I append the new docstring to the existing one, you'll still need to reformat the docstring. The other option is to send the function with your incomplete docstring and use that in the prompt. Could you send me an example pls? Bc I think that's the way to go. Take your already in place information and enrich it with gpt4docstrings.
Correcting an appended docstring in the same file/repo is faster than managing a git merge across two repos (the original one and one with the new docstrings)!
We can add a small text-similarity check (threshold-based) to prevent appending near-duplicates:
https://pypi.org/project/textdistance/
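A minimal sketch of that threshold check follows. To stay dependency-free it uses the standard library's `difflib.SequenceMatcher` in place of `textdistance`; the function names and the 0.8 threshold are assumptions chosen for illustration, not tuned values.

```python
from difflib import SequenceMatcher


def is_near_duplicate(new: str, old: str, threshold: float = 0.8) -> bool:
    """True if the two docstrings are similar enough that appending adds nothing."""
    # Normalize whitespace and case so formatting differences do not count.
    a = " ".join(new.split()).lower()
    b = " ".join(old.split()).lower()
    return SequenceMatcher(None, a, b).ratio() >= threshold


def append_if_different(new: str, old: str, threshold: float = 0.8) -> str:
    """Append the old docstring below the new one unless it is empty or a near-duplicate."""
    if not old.strip() or is_near_duplicate(new, old, threshold):
        return new
    return f"{new}\n\n{old}"
```

A whole-docstring comparison like this is coarse; comparing paragraph by paragraph would catch partial overlaps, at the price of more logic.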
I was also thinking about relying on gpt-3.5-turbo for this. For example, suppose I have this function:
def dummy(a, b):
    """
    This function sums two integers. It will also raise
    MyCustomException if `a` is bigger than 100.
    """
    return a + b
If I ask gpt-3.5-turbo to create numpy docstrings for this function, it will already take into consideration the provided information. In this case, I'll get this:
def dummy(a, b):
    """
    Sum two integers.

    This function takes two integer inputs `a` and `b` and returns their sum. It also includes exception handling
    to raise `MyCustomException` if `a` is greater than 100.

    Parameters
    ----------
    a : int
        The first integer to be added.
    b : int
        The second integer to be added.

    Returns
    -------
    int
        The sum of `a` and `b`.

    Raises
    ------
    MyCustomException
        If `a` is greater than 100.

    Examples
    --------
    >>> dummy(10, 20)
    30
    >>> dummy(110, 5)
    Traceback (most recent call last):
        ...
    MyCustomException: 'a' is greater than 100
    """
    if a > 100:
        raise MyCustomException("'a' is greater than 100")
    return a + b
OK, it makes sense to ask GPT-3.5 to include the existing docstring.
I think we need to customize the prompt to make the integration explicit.
That way, the user's manual work is limited.
We just need to confirm whether the user wants to integrate the existing docstring.
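One way the explicit integration could look is sketched below: the existing docstring is pulled out with the standard `ast` module and restated in the prompt as information that must be kept. `build_prompt`, the `keep_existing` flag, and the prompt wording are all hypothetical; they are not the prompts gpt4docstrings actually sends.

```python
import ast
import textwrap


def build_prompt(function_source: str, style: str = "numpy", keep_existing: bool = True) -> str:
    """Build a docstring-generation prompt for a single function definition."""
    # Assumes the first statement in `function_source` is a function or class definition.
    node = ast.parse(textwrap.dedent(function_source)).body[0]
    existing = ast.get_docstring(node) if keep_existing else None
    prompt = f"Write a {style}-style docstring for the Python function below.\n"
    if existing:
        prompt += (
            "Keep every fact from the existing docstring, "
            "including exceptions and code samples:\n"
            f"{existing}\n"
        )
    prompt += f"\nFunction:\n{function_source}"
    return prompt
```

The `keep_existing` flag corresponds to the confirmation step: the user opts in once, and no manual merging is needed afterwards.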
Any movement on this? I'd also like to see the feature of adding the existing docstring as input for the GPT model to factor into its final output.
Thanks for the great library.
Instead of replacing the existing docstring, please do this:
docnew = doc_gpt4 + "\n" + doc_existing
That way, the user can decide afterwards whether to keep their own docstring info. We often put valuable information in docstrings, such as code samples.