Closed Pavamana15 closed 2 months ago
Hi @Pavamana15, thanks for filing this issue and providing more details. I think the error message is a bit misleading. The root cause of the issue is that you are supplying the same column (Name) as both the sequence key and a context column.
For multi-sequence data, your context column cannot be the same as your sequence key. The sequence key is meant to identify each sequence, so by definition it never varies within a sequence. A context column, however, is meant to be a different column (not the sequence key) that also remains constant within a sequence. Removing the context column should fix your issue.
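To make the rule above concrete, here is a minimal sketch of a pre-flight check you could run on your data before fitting. It is not part of SDV: the helper name check_context_columns, the sample rows, and the Sector, Date, and Close columns are all assumptions made up for illustration. Only Name comes from this issue. The helper flags a context column that is the sequence key itself, and also one that changes value inside a sequence.

```python
from collections import defaultdict


def check_context_columns(rows, sequence_key, context_columns):
    """Hypothetical helper (not an SDV function).

    Returns a list of human-readable problems with the proposed context columns.
    """
    problems = []
    for column in context_columns:
        # Rule 1: the sequence key cannot double as a context column.
        if column == sequence_key:
            problems.append(
                f"'{column}' is the sequence key, so it cannot also be a context column"
            )
            continue

        # Rule 2: a context column must hold a single value within each sequence.
        values_per_sequence = defaultdict(set)
        for row in rows:
            values_per_sequence[row[sequence_key]].add(row[column])
        varying = sorted(key for key, values in values_per_sequence.items() if len(values) > 1)
        if varying:
            problems.append(f"'{column}' varies within sequence(s): {varying}")
    return problems


# Made-up sample data: two sequences identified by Name.
rows = [
    {"Name": "A", "Sector": "Tech", "Date": "2020-01-01", "Close": 10.0},
    {"Name": "A", "Sector": "Tech", "Date": "2020-01-02", "Close": 10.5},
    {"Name": "B", "Sector": "Energy", "Date": "2020-01-01", "Close": 20.0},
    {"Name": "B", "Sector": "Energy", "Date": "2020-01-02", "Close": 19.0},
]

print(check_context_columns(rows, "Name", ["Name"]))    # the situation in this issue
print(check_context_columns(rows, "Name", ["Sector"]))  # constant per sequence
print(check_context_columns(rows, "Name", ["Close"]))   # changes over time
```

In this made-up data, Sector would be a reasonable context column because it stays fixed for each Name, while Close would not because it changes from row to row.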
Thank you so much, @npatki. The error is resolved now, but it is taking a lot of time to run.
@npatki I could generate synthetic data using PARSynthesizer with fewer epochs, i.e., 30 epochs. However, it does not generate the synthetic data in sequential order. I mean, the rows should be ordered in time, as they were in the original data. Why am I getting this result, or what mistake am I making?
@npatki Can you also explain how to evaluate the quality of synthetic data?
Hi @Pavamana15, here on GitHub we typically file a separate issue for each topic you'd like to discuss. This helps keep the GitHub organized for other users who may have similar issues, and for tracking these in the future. With this in mind, will you please file new issues for the two new topics of performance testing and quality? Appreciate your help in keeping our GitHub organized.
I will close out this initial issue because it seems like the original problem (error that you were seeing) has been resolved. Thanks.
Environment Details
Please indicate the following details about the environment in which you found the bug:
Error Description
I used one of the available multi-sequence data sets online to generate a synthetic dataset. But I am getting the following errors.
Steps to reproduce