Open Darius18 opened 2 months ago
@Darius18 Thank you for your interest in V-RECS.
Regarding the error, the problem is in how you are passing the dataset.
To fix this and use it correctly, you should:
Parse the dataset using the following function, which loads the dataset and converts it into the expected shape:
def load_data(uploaded_file: pd.DataFrame) -> Union[str, Tuple[list, pd.DataFrame]]:
    """
    Load and preprocess data from an uploaded file.

    Parameters
    ----------
    uploaded_file : pd.DataFrame
        The uploaded file containing data to be loaded.

    Returns
    -------
    Union[str, Tuple[List[Tuple[str, str]], pd.DataFrame]]
        A tuple containing:
        - A list of tuples with column names and their corresponding custom data types.
        - The loaded DataFrame with columns renamed to lowercase.
        In case of an error, returns a string with the error message.
    """
    try:
        encoding = utils.detect_encoding(uploaded_file)
        delimiter = utils.detect_separator(uploaded_file, encoding)
        df = pd.read_csv(uploaded_file, encoding=encoding, on_bad_lines='skip', delimiter=delimiter)
        df = df.rename(columns=lambda x: x.lower())
        column_types = [(col, utils.map_dtype_to_custom(dtype)) for col, dtype in df.dtypes.items()]
        return column_types, df
    except Exception as e:
        return f"Error loading file: {e}"
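Because `load_data` returns either an error string or a `(column_types, df)` tuple, the caller has to check which one came back before unpacking it. Here is a minimal sketch of that check. Note that `unpack_load_result` and `LoadResult` are hypothetical names used for illustration only and are not part of the V-RECS code:

```python
from typing import Any, List, Tuple, Union

# Shape of load_data's return value: an error string, or (column_types, dataframe).
LoadResult = Union[str, Tuple[List[Tuple[str, str]], Any]]


def unpack_load_result(result: LoadResult) -> Tuple[List[Tuple[str, str]], Any]:
    """Hypothetical helper: unpack load_data's result, raising if it is an error."""
    if isinstance(result, str):
        # load_data signals failure by returning the error message as a string.
        raise ValueError(result)
    column_types, df = result
    return column_types, df
```

It would be used as `column_types, df = unpack_load_result(load_data(uploaded_file))`, so a failed load stops immediately instead of the error string being passed on to the model.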
Your task is to recommend and explain to a user the best visualization for a given dataset using the VegaZero template.
Here are the available options: mark [T], encoding x [X], y [Y], aggregate [AggFunction], color [Z], transform filter [F], group [G], bin [B], sort [S], topk [K].
Let's think step by step.
{query}
{dataset}
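The lines above read as the prompt template, with `{query}` and `{dataset}` as placeholders. Here is a rough sketch of filling it in. It assumes the template lines are joined with newlines and that `{dataset}` takes a short textual description of the columns. The exact dataset serialization V-RECS expects is not shown here, so `describe_columns`, `build_prompt`, and the example type names are hypothetical:

```python
from typing import List, Tuple

# Template text copied from the reply; joining the lines with "\n" is an assumption.
PROMPT_TEMPLATE = (
    "Your task is to recommend and explain to a user the best visualization "
    "for a given dataset using the VegaZero template.\n"
    "Here are the available options: mark [T], encoding x [X], y [Y], "
    "aggregate [AggFunction], color [Z], transform filter [F], group [G], "
    "bin [B], sort [S], topk [K].\n"
    "Let's think step by step.\n"
    "{query}\n"
    "{dataset}"
)


def describe_columns(column_types: List[Tuple[str, str]]) -> str:
    """Hypothetical serialization: 'name (type), name (type), ...'."""
    return ", ".join(f"{name} ({ctype})" for name, ctype in column_types)


def build_prompt(query: str, column_types: List[Tuple[str, str]]) -> str:
    """Fill the two placeholders of the template."""
    return PROMPT_TEMPLATE.format(query=query, dataset=describe_columns(column_types))


if __name__ == "__main__":
    # Example column types are made up for illustration.
    columns = [("name", "categorical"), ("score", "numerical")]
    print(build_prompt("Show the average score per name", columns))
```

The `column_types` list returned by `load_data` has the same `(column, type)` shape, so it could be passed to `build_prompt` directly.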
However, if you would rather not write the code yourself, I have fixed the README so that you can launch the V-RECS web app and test it quickly. (All the code is available in the src folder, along with the documentation in the docs folder.)
Thank you again for your interest. I'm looking forward to hearing whether you can fix it. Also, this is my first contribution, so any feedback would be great!
I encountered an issue while using the DeepvizLab/vrecs model (from Hugging Face) to generate VegaZero code from a CSV dataset. The model appears to generate incorrect or irrelevant responses when given a CSV dataset and a specific query.
I loaded a CSV file named score.csv with the following structure:
Here is my code:
It ran, but the answer is simply incorrect. Answer:
Environment:
Python Version: 3.x
CUDA Version: 12.2
GPU: NVIDIA 4060 Ti
Additional Notes:
The paper is great, and thanks for your help!