Closed AlekseySh closed 1 year ago
@AlekseySh
What do you think about the following docstring format? Is it ok, or is it too much?
@dapladoc Hey!
We thought about putting formulas into docstrings. It seems like a trade-off: readability for developers vs readability for users. @DaloroAT, what do you think?
As for code examples, I like the idea, but only if we have a way to test them automatically. Perhaps there are some existing tools.
PS. In this particular case, I have a feeling that math overcomplicates the understanding of the metric.
What about this? "The metric calculates the percentage of positive distances higher than a given q-th percentile of negative distances."
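To make the one-line description concrete, here is a minimal sketch of what it says, in plain Python. This is not the OML implementation: the function name fnmr_at_fmr, its arguments, and the nearest-rank percentile rule are assumptions made for illustration (numeric libraries often interpolate between neighbours instead).

```python
import math


def fnmr_at_fmr(pos_dist, neg_dist, fmr_percent):
    """Percentage of positive distances above the q-th percentile of negative distances.

    Hypothetical helper: the name, the arguments and the percentile rule are assumptions.
    """
    if not pos_dist or not neg_dist:
        raise ValueError("Both lists of distances must be non-empty.")

    ordered = sorted(neg_dist)
    # Nearest-rank percentile: the smallest value with at least q% of the data at or below it.
    rank = max(1, math.ceil(fmr_percent / 100 * len(ordered)))
    threshold = ordered[rank - 1]

    # Positive pairs farther apart than the threshold are counted as false non-matches.
    n_rejected = sum(d > threshold for d in pos_dist)
    return 100.0 * n_rejected / len(pos_dist)


pos = [0.1, 0.2, 0.6, 0.9]
neg = [0.3, 0.5, 0.7, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.5]
print(fnmr_at_fmr(pos, neg, fmr_percent=10))
```

With these toy distances the threshold is the smallest negative distance (0.3), and two of the four positive distances lie above it.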
Off-topic: these formulas were useful at least for the following. One of the goals of OML is to do research and probably publish some results. I am now working on experiments with a surrogate precision loss. The idea is that you can express precision@k using the Heaviside function, then replace it with a sigmoid with temperature so that it becomes differentiable. Thanks to your formulas, I am now thinking about creating a differentiable version of FNMR@FMR.
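A rough illustration of the Heaviside-to-sigmoid trick, in plain Python. This is only a sketch of the general idea, not the surrogate loss from the experiments: the function names and the choice to count "values above a threshold" are assumptions. In practice the soft version would be written with tensors so that autograd can differentiate it.

```python
import math


def heaviside(x):
    # Hard step: 1 above zero, 0 otherwise. Its derivative is zero almost everywhere.
    return 1.0 if x > 0 else 0.0


def sigmoid(x, temperature):
    # Smooth step; approaches the Heaviside function as temperature goes to 0.
    z = x / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for very negative z
    return e / (1.0 + e)


def hard_fraction_above(values, threshold):
    # Exact fraction of values above the threshold (not differentiable).
    return sum(heaviside(v - threshold) for v in values) / len(values)


def soft_fraction_above(values, threshold, temperature):
    # Smooth surrogate of the same count (differentiable in values and threshold).
    return sum(sigmoid(v - threshold, temperature) for v in values) / len(values)


values = [0.1, 0.2, 0.6, 0.9]
print(hard_fraction_above(values, 0.5))
print(soft_fraction_above(values, 0.5, temperature=1.0))
print(soft_fraction_above(values, 0.5, temperature=0.001))
```

A small temperature tracks the hard count closely but gives nearly vanishing gradients away from the threshold, while a large temperature gives smoother gradients at the cost of a looser approximation.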
For me, there is no difference between developers and users in terms of OML. Everyone in OML is the same (in contrast to torch, with its separate developers and users). I like @dapladoc's suggestion. We also thought about it previously, but we had higher-priority tasks before the first release.
Probably we don't need details for each object in the library (utils, patching, etc.), but for hyperparameters it's a good idea.
Also, if we have a long description for a function, we can use the following pattern:
def crazy_function(a, b):
    ...

crazy_function.__doc__ = crazy_function_doc

And then we can test crazy_function_doc in the same way as we test README snippets.
Or even better.
def crazy_function(a, b):
    """
    some actions ...

    Args:
        a: param1
        b: param2
    """

crazy_function.__doc__ = crazy_function.__doc__ + extra_doc_with_examples

Then test extra_doc_with_examples. It's ok for developers to see a little snippet with args, and cool for users to see a tested example.
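One possible way to test such an extra string is the standard doctest module. The sketch below is only an assumption about how it could be wired up, not how OML tests its README snippets: the function scale and the helper run_doc_examples are hypothetical names. The `or ""` guard covers running Python with -OO, where docstrings are stripped and __doc__ is None.

```python
import doctest


def scale(values, factor):
    """
    Multiply every value by a factor.

    Args:
        values: list of numbers
        factor: multiplier
    """
    return [v * factor for v in values]


# Extra documentation with a runnable example, kept next to the function.
extra_doc_with_examples = """
    Example:
        >>> scale([1, 2, 3], 2)
        [2, 4, 6]
"""

scale.__doc__ = (scale.__doc__ or "") + extra_doc_with_examples


def run_doc_examples(doc, namespace):
    # Hypothetical helper: parse the ">>>" examples in a string and execute them.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(doc, dict(namespace), "extra_doc", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Failure reports are discarded here; the returned counts are checked instead.
    return runner.run(test, out=lambda text: None)


results = run_doc_examples(extra_doc_with_examples, {"scale": scale})
assert results.failed == 0 and results.attempted == 1
```

Since the example ends up inside scale.__doc__ anyway, running doctest over the whole module would also pick it up; testing the string directly just keeps the check independent of how the docstring is assembled.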
@DaloroAT
And where should extra_doc_with_examples be located? Somewhere in docs/readme/? Or is it a string right in the same module as crazy_function?
@dapladoc As far as I remember @DaloroAT's code from another project, he put it right next to the function.
@dapladoc @DaloroAT If you want to continue the discussion about how we check code examples, let's do it in a dedicated issue: https://github.com/OML-Team/open-metric-learning/issues/241. Once we agree on an idea, we can implement it in a dedicated PR.
As for the current scope of work, let's leave the example as @dapladoc suggested here: https://github.com/OML-Team/open-metric-learning/issues/240#issuecomment-1328026114, but without testing for now.
For me, the following options are good: 1) Keep the extra docs right below the function:
def func1(a, b):
    """
    Obligatory description for func1...

    Args:
        a: param1
        b: param2
    """
    ...

extra_docs_func1 = \
"""
Formulae for func1...
"""

func1.__doc__ = func1.__doc__ + extra_docs_func1


def func2(c):
    """
    Obligatory description for func2...

    Args:
        c: param1
    """
    ...

extra_docs_func2 = \
"""
Extra comments with snippets for func2...
"""

func2.__doc__ = func2.__doc__ + extra_docs_func2

2) Move the extra_docs of each function to a separate file docs.py on the same level, then import them.
The first approach keeps the context of a function in one place, while the second separates computational code from ideas and keeps both files small. For me, option 1 is better.
@DaloroAT, what do you think about LaTeX formulas for the metric?
As one solution, I can suggest keeping both: the formulas and the text description from https://github.com/OML-Team/open-metric-learning/issues/240#issuecomment-1328036668 (at the top of the docs). Formulas may be placed nearby in __doc__, as @DaloroAT suggested.
LaTeX formulae are the best decision.
Of course, we can use a short description (docs in the body of the function) and a detailed description (extra_docs) at the same time.
I hope readthedocs allows us to do these tricks :D
@dapladoc could you show the source of the docstring above? I want to better understand what it looks like, to decide whether we need to split it as @DaloroAT suggested.
@AlekseySh
Here are some references that might be helpful https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html https://pytorch.org/docs/stable/generated/torch.pca_lowrank.html https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html?highlight=pca#sklearn.decomposition.PCA
@dapladoc thanks for sharing. Yes, it's a bit tricky to read, but it's the same in other libraries. Thus, I am not sure that we would benefit from splitting the docs into several parts via __doc__. @DaloroAT, what would you say?
Ok, let's keep full docs in functions
Okay, then it's time to create a PR and document the rest of the functions in the same format as @dapladoc did for FNMR.
The scope of work: