instill-ai / instill-core

🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
https://www.instill.tech

[Text] Regular expression for data cleansing #1131

Open ShihChun-H opened 3 days ago

ShihChun-H commented 3 days ago

Describe Your Proposed Tutorial

Issue Description

Current State

Why We Want to Change?

Proposed Change

Pseudo Recipe

# VDP Version
version: v1beta

component:
  text-0:
    type: text
    input:
      # Array of text to be cleaned.
      texts:
      setting:
        # Option 1: clean with regular expressions.
        clean-method: Regex
        # Texts that match these patterns are removed from the array.
        exclude-patterns:
        # Texts that match these patterns are kept in the array.
        include-patterns:

        # Option 2: clean with substrings.
        clean-method: Substring
        # Texts that contain these substrings are removed from the array.
        exclude-substrings:
        # Texts that contain these substrings are kept in the array.
        include-substrings:
        # Whether substring matching is case-sensitive. Defaults to false
        # (case-insensitive). When true, "cat" matches only 'cat', not 'Cat'
        # or 'CAT'. When false, "cat" matches 'cat', 'Cat', 'CAT', and any
        # other mix of uppercase and lowercase letters.
        case-sensitive:

    condition:
    task: TASK_CLEAN_DATA
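
The filtering behaviour described in the recipe could be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not the component's actual implementation: the function names (`cleanByRegex`, `cleanBySubstring`, and the helpers) are hypothetical, and the recipe does not say what happens when both include and exclude settings are supplied, so the sketch assumes that an empty include list keeps everything and that exclude takes precedence over include. Patterns use the RE2 syntax of Go's standard `regexp` package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compileAll compiles every pattern and reports the first invalid one.
func compileAll(patterns []string) ([]*regexp.Regexp, error) {
	out := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		out = append(out, re)
	}
	return out, nil
}

// matchesAny reports whether s matches at least one compiled pattern.
func matchesAny(res []*regexp.Regexp, s string) bool {
	for _, re := range res {
		if re.MatchString(s) {
			return true
		}
	}
	return false
}

// cleanByRegex keeps a text when it matches an include pattern (or no
// include patterns are given) and matches no exclude pattern.
// Assumption: exclude wins over include; the proposal does not specify this.
func cleanByRegex(texts, include, exclude []string) ([]string, error) {
	inc, err := compileAll(include)
	if err != nil {
		return nil, err
	}
	exc, err := compileAll(exclude)
	if err != nil {
		return nil, err
	}
	kept := []string{}
	for _, t := range texts {
		if len(inc) > 0 && !matchesAny(inc, t) {
			continue
		}
		if matchesAny(exc, t) {
			continue
		}
		kept = append(kept, t)
	}
	return kept, nil
}

// cleanBySubstring applies the same keep/remove rule with plain substrings.
// When caseSensitive is false, both sides are lowercased before comparing.
func cleanBySubstring(texts, include, exclude []string, caseSensitive bool) []string {
	norm := func(s string) string {
		if caseSensitive {
			return s
		}
		return strings.ToLower(s)
	}
	containsAny := func(s string, subs []string) bool {
		for _, sub := range subs {
			if strings.Contains(norm(s), norm(sub)) {
				return true
			}
		}
		return false
	}
	kept := []string{}
	for _, t := range texts {
		if len(include) > 0 && !containsAny(t, include) {
			continue
		}
		if containsAny(t, exclude) {
			continue
		}
		kept = append(kept, t)
	}
	return kept
}

func main() {
	texts := []string{"Cat food", "dog 123", "bird"}
	byRegex, err := cleanByRegex(texts, nil, []string{`\d+`})
	if err != nil {
		panic(err)
	}
	fmt.Println(byRegex)                                              // texts containing digits are dropped
	fmt.Println(cleanBySubstring(texts, []string{"cat"}, nil, false)) // case-insensitive include
	fmt.Println(cleanBySubstring(texts, []string{"cat"}, nil, true))  // case-sensitive include
}
```

Compiling the patterns once up front surfaces an invalid expression as an error before any text is filtered, which is one way the component could report a bad `exclude-patterns` or `include-patterns` value.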

Rules for the Component Hackathon


Component Contribution Guideline | Documentation | Official Go Tutorial

linear[bot] commented 3 days ago

INS-6538 [Text] Regular expression for data cleansing

NailaRais commented 3 days ago

@ShihChun-H I am passionate about making a positive contribution.

ShihChun-H commented 3 days ago

Hi @NailaRais, Fantastic! I've assigned the issue to you! Please make sure to read and follow the rules stated above 🙌🏻

NailaRais commented 3 days ago

> Hi @NailaRais, Fantastic! I've assigned the issue to you! Please make sure to read and follow the rules stated above 🙌🏻

Thank You