intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

[Discussion] Operations needed to be supported in shards #5694

Open dding3 opened 2 years ago

dding3 commented 2 years ago

To provide a better user experience with orca shards, this issue is created to discuss which operations need to be supported in orca shards.

The operations listed below are motivated by the following notebooks.

https://www.kaggle.com/code/pmarcelino/comprehensive-data-exploration-with-python — operations used:

  1. isnull, sum, sort_values, standard_scaler, get_dummies
  2. nice to have: describe (summary of a dataframe), corr, argsort
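For reference, a minimal plain-pandas sketch of the operations listed above, on toy data rather than shards (the `sklearn` StandardScaler stands in for `standard_scaler`):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "price": [100.0, 200.0, None, 400.0],
    "rooms": [2, 3, 3, 5],
    "type": ["A", "B", "A", "B"],
})

# Count missing values per column, then rank columns by missing count
missing = df.isnull().sum().sort_values(ascending=False)

# One-hot encode the categorical column
dummies = pd.get_dummies(df["type"], prefix="type")

# Standardize a numeric column to zero mean / unit variance
scaled = StandardScaler().fit_transform(df[["rooms"]])
```

The nice-to-haves map to `df.describe()`, `df.corr()`, and `numpy.argsort`.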

https://www.kaggle.com/code/isaienkov/riiid-answer-correctness-prediction-eda-modeling — operations used:

  1. isnull, sum, groupby, agg, merge, fillna, not (boolean negation, `~`)
  2. nice to have: sklearn.feature_selection.RFE
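A minimal plain-pandas sketch of this group (toy data and column names are illustrative, not from the notebook):

```python
import pandas as pd

train = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "answered_correctly": [1, 0, 1, 1, None],
})

# Per-user aggregation with groupby + agg
user_stats = train.groupby("user_id").agg(
    attempts=("answered_correctly", "count"),
    accuracy=("answered_correctly", "mean"),
).reset_index()

# Join the aggregated features back onto the original rows
merged = train.merge(user_stats, on="user_id", how="left")

# Replace missing labels, then filter with a negated ("not") mask
merged["answered_correctly"] = merged["answered_correctly"].fillna(0)
wrong = merged[~(merged["answered_correctly"] == 1)]
```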

https://www.kaggle.com/code/ammar111/youtube-trending-videos-analysis — operations used:

  1. fillna, isna, value_counts, count, filter, groupby
  2. nice to have: describe, most_common, corr, sort_values
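A minimal plain-pandas sketch of the main operations in this group (toy data, not the actual YouTube dataset):

```python
import pandas as pd

videos = pd.DataFrame({
    "channel": ["a", "a", "b", "c", "c", "c"],
    "views": [10, None, 30, 40, 50, 60],
})

# Flag missing view counts, then fill them
n_missing = videos["views"].isna().sum()
videos["views"] = videos["views"].fillna(0)

# Frequency of each channel (value_counts sorts descending by default)
top_channels = videos["channel"].value_counts()

# Keep only channels with at least two videos (groupby + filter)
frequent = videos.groupby("channel").filter(lambda g: len(g) >= 2)
```

Among the nice-to-haves, `most_common` is from `collections.Counter`; `describe`, `corr`, and `sort_values` are standard DataFrame methods.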

https://www.kaggle.com/code/jiashenliu/introduction-to-financial-concepts-and-data — operations used:

  1. filter; converting a pandas Series to a NumPy array, processing it with NumPy operations, and writing the result back as a new column
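A minimal sketch of that pattern (the loan data and the monthly-rate computation are illustrative assumptions, not taken from the notebook):

```python
import numpy as np
import pandas as pd

loans = pd.DataFrame({
    "principal": [1000.0, 2000.0, 500.0],
    "rate": [0.05, 0.07, 0.10],
})

# Row filter with a boolean mask
large = loans[loans["principal"] >= 1000]

# Pull a column out as a NumPy array, process it with NumPy,
# and write the result back as a new column
rates = loans["rate"].to_numpy()
loans["monthly_rate"] = np.round(rates / 12, 6)
```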
jason-dai commented 2 years ago

Please summarize, for each example, what additional operations are needed.