argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0
3.86k stars, 361 forks

[DOCS] review flow within docs for `TextDescriptivesExtractor` #4438

Closed: davidberenstein1957 closed this issue 9 months ago

davidberenstein1957 commented 9 months ago

Which page or section is this issue related to?

The docs for the `TextDescriptivesExtractor` are not as smooth as expected and could be improved.

dosubot[bot] commented 9 months ago

🤖

Hello @davidberenstein1957! 👋

I'm Dosu, an AI bot here to help you with the Argilla repository while awaiting the assistance of a human maintainer. I'm equipped to help you solve bugs, answer questions, and guide you on your journey to becoming a contributor.

Let's make your experience with Argilla as smooth as possible! 😊

Thank you for your feedback on the documentation for the TextDescriptivesExtractor. I agree that there are areas where we can improve the clarity and flow of the information. Here are some suggestions:

  1. Overview: We can start with a brief overview of what the TextDescriptivesExtractor class does. This will provide more context about how it fits into the larger Argilla project.

  2. Methods: Each method should have a clear, concise summary of what it does. For example, the _extract_metrics_for_single_field method could explain what it means for a field to be "empty" and why such fields are skipped.

  3. Parameters: Each parameter in the methods should be clearly explained. For instance, the basic_metrics parameter of _extract_metrics_for_single_field needs more detail about what it does and what its default value means.

  4. Return Values: Each method should clearly state what it returns. For example, _cast_to_python_types could use more detail about what the returned dataframe looks like.

  5. Examples: We can include examples of how to use the class and its methods. While there are examples for the update_records and update_dataset methods, we could add examples for the other methods as well.

  6. Error Handling: We can explain what errors can be raised and why. This is not currently done in the documentation.

  7. Dependencies: If the class depends on any external libraries (like pandas or textdescriptives), these should be mentioned in the documentation.

  8. Formatting: We can use consistent formatting for the documentation. This includes things like capitalization, punctuation, and code formatting.
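
To make points 2 and 6 a bit more concrete, here is a minimal sketch of what "skipping empty fields" could look like. The helper name `non_empty_texts`, the plain-dict record shape, and the rule that `None` or whitespace-only values count as empty are all assumptions for illustration; this is not Argilla's actual implementation.

```python
from typing import Dict, List, Optional


def non_empty_texts(records: List[Dict[str, Optional[str]]], field: str) -> List[str]:
    """Collect the values of `field` that contain text (hypothetical helper)."""
    texts = []
    for record in records:
        # A record that lacks the field entirely is treated like an empty one.
        value = record.get(field)
        # Assumption: None and whitespace-only strings are considered "empty".
        if value is not None and value.strip():
            texts.append(value)
    return texts


# Hypothetical records, modelled as plain dictionaries of field name -> text.
records = [
    {"prompt": "What is Argilla?", "context": None},
    {"prompt": "   ", "context": "Some background text."},
]
print(non_empty_texts(records, "prompt"))
print(non_empty_texts(records, "context"))
```

Documenting a rule like this next to the method would tell readers why some records produce no metrics for a given field.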

Here's an example of how we might revise the documentation for the __init__ method:

def __init__(
    self,
    model: str = "en",
    metrics: Optional[List[str]] = None,
    fields: Optional[List[str]] = None,
    visible_for_annotators: bool = True,
    show_progress: bool = True,
):
    """
    Initialize a new TextDescriptivesExtractor object.

    Args:
        model (str, optional): The language of the model to use for text descriptives. Default is "en".
        metrics (Optional[List[str]], optional): A list of metrics to extract. Options include 
            ["descriptive_stats", "readability", "dependency_distance", "pos_proportions", "coherence", "quality", "information_theory"].
            If None, all metrics will be extracted. Default is None.
        fields (Optional[List[str]], optional): A list of field names to extract metrics from. If None, all fields will be used. Default is None.
        visible_for_annotators (bool, optional): Whether the extracted metrics should be visible to annotators. Default is True.
        show_progress (bool, optional): Whether to show a progress bar when extracting metrics. Default is True.
    """

This revised documentation describes what each parameter does and what its default value means. I hope these suggestions are helpful. If you have any other ideas or feedback, please let me know.
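
As a rough sketch of the behaviour the docstring describes for `metrics` (and of the error handling mentioned in point 6), the following shows one way a `None` default could be expanded to every metric group and unknown names rejected. The function `resolve_metrics` and the constant `AVAILABLE_METRICS` are hypothetical names introduced here; the metric names are copied from the docstring above, and the real class may behave differently.

```python
from typing import List, Optional

# Metric group names as listed in the docstring above.
AVAILABLE_METRICS = [
    "descriptive_stats",
    "readability",
    "dependency_distance",
    "pos_proportions",
    "coherence",
    "quality",
    "information_theory",
]


def resolve_metrics(metrics: Optional[List[str]] = None) -> List[str]:
    """Return the metric groups to extract (hypothetical helper)."""
    # Assumption: None means "use every available metric group".
    if metrics is None:
        return list(AVAILABLE_METRICS)
    # Reject names that are not in the documented list.
    unknown = [name for name in metrics if name not in AVAILABLE_METRICS]
    if unknown:
        raise ValueError(
            f"Unknown metrics: {unknown}. Choose from: {AVAILABLE_METRICS}."
        )
    return list(metrics)


print(resolve_metrics())
print(resolve_metrics(["readability", "quality"]))
```

Spelling out this kind of contract in the docstring (what `None` expands to, and which error an invalid name raises) is the sort of detail the suggestions above are asking for.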

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
