Closed masifmunir closed 5 months ago
Hi @masifmunir, nice to meet you.
Although you've filed this issue in the SDMetrics library, the code you are using is from the SDV library. Just want to confirm that you are using `sdv`, not `sdmetrics`, in your import statement?
The `sdv` module does not have any function called `evaluate`. May I ask which documentation or guide you are referring to?
The latest documentation and tutorials for SDV can be found at https://docs.sdv.dev/sdv. Here are some recommended resources for getting started:
Thanks.
My SDV version is 1.11.0, but I get an error when I use `from sdv.evaluation import evaluate` followed by `evaluate(new_data, data)`. Reference: https://www.kdnuggets.com/2022/03/generate-tabular-synthetic-dataset.html
Hi @masifmunir thanks for sharing the link. As the article you're referencing is over 2 years old, much of the API and many of the instructions it recommends are out of date. For example, there is no longer an `evaluate` module. You should do this instead:
from sdv.evaluation.single_table import evaluate_quality
As mentioned in the docs: https://docs.sdv.dev/sdv/single-table-data/evaluation
For the most up-to-date API and usage instructions, please refer to the links above in my previous comment. Let us know if you have any questions.
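For anyone landing here from the older article, a minimal sketch of the updated call might look like the following. It assumes SDV 1.x (the thread mentions 1.11.0), where `SingleTableMetadata` is importable from `sdv.metadata`. The helper name `quality_score` and its argument names are placeholders, not part of SDV, and newer SDV releases may expose a different metadata class, so please check the docs for your version.

```python
def quality_score(real_data, synthetic_data):
    """Return the overall quality score for two pandas DataFrames.

    Hypothetical helper, not part of SDV. Imports are kept inside the
    function so that defining it does not require SDV to be installed.
    """
    # Assumed to exist in SDV 1.x (e.g. 1.11.0); verify against your version.
    from sdv.evaluation.single_table import evaluate_quality
    from sdv.metadata import SingleTableMetadata

    # Auto-detect column types from the real data, then review them by hand.
    metadata = SingleTableMetadata()
    metadata.detect_from_dataframe(real_data)

    # Replaces the old `evaluate(new_data, data)` call from the article.
    report = evaluate_quality(real_data, synthetic_data, metadata)
    return report.get_score()


# Usage, with SDV installed and two DataFrames loaded:
#   score = quality_score(real_df, synthetic_df)
```

Note the argument order: the real data comes first and the synthetic data second, and the metadata is a required third argument.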
Thank you for your kind guidance. Could you please guide me further? I have a synthetic dataset to evaluate, which is in numerical table form without metadata. Please advise on the appropriate steps. Have a nice day. Regards,
Hi @masifmunir my pleasure.
The most recent version of SDMetrics requires you to include metadata. Metadata is meant to be the ground source-of-truth description of your dataset. It is very important because it allows SDMetrics to apply the correct type of evaluation to each column.
For example, if you are storing HTTP responses such as `404` (not found) or `500` (server error), these are discrete values. So it is necessary to evaluate them as categorical values, not as a numerical distribution.
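To make the HTTP example concrete, here is a small, hand-rolled sketch in plain Python. It is not SDMetrics' own implementation; the data is made up, and the helper names are hypothetical. It compares how often each status code appears (one minus the total variation distance between the two frequency tables) and contrasts that with a numerical summary, which produces a code that never occurs in the data.

```python
from collections import Counter


def category_frequencies(values):
    """Map each distinct value to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def tv_complement(real_values, synthetic_values):
    """1 minus the total variation distance between category frequencies."""
    real_freq = category_frequencies(real_values)
    synth_freq = category_frequencies(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    distance = 0.5 * sum(
        abs(real_freq.get(c, 0.0) - synth_freq.get(c, 0.0)) for c in categories
    )
    return 1.0 - distance


# Made-up status codes, for illustration only.
real_status = [200, 200, 404, 500]
synthetic_status = [200, 404, 404, 500]

# Treated as numbers, the mean is a status code that never occurs in the data.
numeric_mean = sum(real_status) / len(real_status)

# Treated as categories, we compare how often each code appears.
score = tv_complement(real_status, synthetic_status)
```

A score of 1.0 would mean the two columns have identical category frequencies; lower values mean the frequencies differ more.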
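To create metadata, you have a few options:
- Use SDV to auto-detect the metadata from the data, then inspect and update it to be correct (see the API docs: https://docs.sdv.dev/sdv/single-table-data/data-preparation/single-table-metadata-api)
- Generate a Python dictionary from scratch using the metadata description (see the SDMetrics docs: https://docs.sdv.dev/sdmetrics/getting-started/metadata/single-table-metadata)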
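For a table that is mostly numerical, writing the metadata by hand as a plain Python dictionary can be quite short. The sketch below assumes the single-table layout of a top-level `"columns"` key mapping each column name to an `"sdtype"`; the helper `build_metadata` and the column names are made up for illustration, so please confirm the exact format against the SDMetrics metadata docs for your version.

```python
def build_metadata(column_names, categorical=()):
    """Build a single-table metadata dictionary by hand.

    Hypothetical helper. Every column defaults to "numerical"; names listed
    in `categorical` are marked "categorical" instead.
    """
    categorical = set(categorical)
    return {
        "columns": {
            name: {"sdtype": "categorical" if name in categorical else "numerical"}
            for name in column_names
        }
    }


# Made-up column names, for illustration only.
metadata = build_metadata(
    ["response_time_ms", "payload_bytes", "status_code"],
    categorical=["status_code"],
)

# With SDMetrics installed, the dictionary could then be passed to a report
# (assumed usage; check the SDMetrics docs for your version):
#   from sdmetrics.reports.single_table import QualityReport
#   report = QualityReport()
#   report.generate(real_data, synthetic_data, metadata)
#   print(report.get_score())
```

Defaulting to numerical and listing the exceptions keeps the dictionary easy to review, which matters because the metadata decides which kind of evaluation each column gets.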
Thanks, I'll check them out.
Hi @masifmunir I'm closing this issue since we've addressed the original problem (`ImportError`). If you encounter any other problems using SDMetrics, please feel free to file a new issue describing your problem. Thanks.
Thanks for your support.