HealthInnovators / UniConvert

Uni-Convert is a user-friendly tool that leverages large language models (LLMs) to transform data from various formats into clean, well-structured JSON.

Input Classifier Module #10

Open elemenohpi opened 5 months ago

elemenohpi commented 5 months ago

The classifier determines whether the input files are already in a standard format that can quickly be converted to the target format. If they are, it passes the input to the search/replace (deterministic converter) module. Otherwise, it passes the data to the LLM converter.
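
A minimal sketch of how this routing could look, assuming "standard format" means the input parses cleanly as JSON, XML, or a delimited table. The names `detect_standard_format` and `route`, the list of delimiters, and the "at least two rows with a consistent column count" rule are all hypothetical choices for illustration, not anything decided in this repo.

```python
import csv
import json
import xml.etree.ElementTree as ET
from typing import Optional


def _looks_delimited(text: str) -> bool:
    """True if every non-empty row has the same number (>1) of columns."""
    lines = text.splitlines()
    for delim in (",", "\t", ";", "|"):  # assumed candidate delimiters
        try:
            rows = [row for row in csv.reader(lines, delimiter=delim) if row]
        except csv.Error:
            continue
        if (
            len(rows) >= 2
            and len(rows[0]) > 1
            and all(len(row) == len(rows[0]) for row in rows)
        ):
            return True
    return False


def detect_standard_format(text: str) -> Optional[str]:
    """Return 'json', 'xml', or 'delimited' if the text parses, else None."""
    stripped = text.strip()
    if not stripped:
        return None
    try:
        json.loads(stripped)
        return "json"
    except json.JSONDecodeError:
        pass
    try:
        ET.fromstring(stripped)
        return "xml"
    except ET.ParseError:
        pass
    if _looks_delimited(stripped):
        return "delimited"
    return None


def route(text: str) -> str:
    """Pick the converter: 'deterministic' for standard input, else 'llm'."""
    return "deterministic" if detect_standard_format(text) else "llm"
```

For example, `route("name,age\nAnn,30\nBob,41")` selects the deterministic converter, while free-form prose falls through to the LLM converter. Trying the strict parsers first keeps the check cheap, since the LLM is only reached when every parser rejects the input.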

elemenohpi commented 5 months ago

Faulty input files sometimes need pre-processing, which is a new challenge to think about. Do we need an LLM filter, or would that slow the process down significantly? Perhaps common data-science cleaning approaches could handle some of these inputs.
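
One possible shape for a cheap, non-LLM pre-processing pass is sketched below. It only covers a few common cleanup steps (byte-order mark, encoding fallback, line endings, control characters, blank lines); the function name `preprocess` and the choice of steps are assumptions for discussion, and dropping blank lines in particular may not be safe for every input format.

```python
import re
import unicodedata

# ASCII control characters except tab (\x09) and newline (\x0a).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def preprocess(raw: bytes) -> str:
    """Apply deterministic cleanup before the input reaches the classifier."""
    try:
        # "utf-8-sig" decodes UTF-8 and drops a leading BOM if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed fallback: latin-1 maps every byte, so it never fails.
        text = raw.decode("latin-1")
    text = unicodedata.normalize("NFC", text)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = _CONTROL_CHARS.sub("", text)
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)
```

A pass like this runs in linear time and needs no model call, so it could sit in front of the classifier without slowing things down; an LLM filter would then only be needed for files that still fail to parse afterwards.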