As a user I want relevant data in the model so that I can receive correct advice
As an AI team member I want to choose appropriate datasets to achieve efficient model training and produce better-quality results
Assumptions or Pre-Requisites:
We will be training a Gemini model, which already has the core Gemini knowledge (presumably the same knowledge we see when we open Google Gemini in a browser window?)
Therefore, should we train the model only on material it definitely does NOT already know?
We have observed that Gemini doesn't seem to 'know' about specific datasets, e.g. those held on gov.ie, so we have a working theory that extracting such data and 'teaching' it to the AI will help make the responses more relevant and specific to the Irish housing market.
Acceptance Criteria: (Must be completed before task is moved to 'Done')
[ ] Must have datasets in JSONL format for model training
[ ] Data must be judged relevant to a Housing application
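To make the JSONL criterion easier to check, here is a minimal sketch of writing and validating a JSONL dataset using only the Python standard library. The field names `prompt` and `response` and the example records are hypothetical placeholders, not the schema required by any particular tuning service; the real schema should be confirmed against the documentation of whichever training pipeline we use.

```python
import json


def to_jsonl(records):
    """Serialize an iterable of dicts to JSONL text: one JSON object per line."""
    # ensure_ascii=False keeps accented characters readable in the file
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def validate_jsonl(text):
    """Parse JSONL text and return the objects.

    Raises ValueError if a non-empty line is not valid JSON
    or is not a JSON object.
    """
    rows = []
    for n, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. the trailing newline
        obj = json.loads(line)  # JSONDecodeError is a subclass of ValueError
        if not isinstance(obj, dict):
            raise ValueError(f"line {n}: expected a JSON object")
        rows.append(obj)
    return rows


# Hypothetical placeholder records (field names are assumptions)
examples = [
    {"prompt": "placeholder question 1", "response": "placeholder answer 1"},
    {"prompt": "placeholder question 2", "response": "placeholder answer 2"},
]

jsonl_text = to_jsonl(examples)
parsed = validate_jsonl(jsonl_text)
```

Running a check like `validate_jsonl` over each file before training could serve as part of the documentary evidence for this criterion.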
Tasks
[x] Task1: Data Collection: Gathering relevant housing data from government agencies, real estate databases, and other sources.
[x] Task2: Data Cleaning and Preprocessing: Handling missing values, outliers, and inconsistencies in the data.
[x] Task3: Feature Engineering: Creating new features or transforming existing ones to improve model performance.
[x] Task4: Exploratory Data Analysis (EDA): Analyzing the data to identify patterns, trends, and correlations.
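As an illustration of the kind of cleaning described in Task2, the sketch below drops rows with a missing or non-numeric value and then removes outliers using the interquartile-range (IQR) rule. This is only one possible approach and not necessarily what was done for this ticket; the field name `monthly_rent`, the threshold `k=1.5`, and the sample values are made-up placeholders.

```python
import math
import statistics


def clean_rows(rows, field, k=1.5):
    """Drop rows where `field` is missing or non-numeric, then drop IQR outliers."""
    present = []
    for row in rows:
        try:
            value = float(row[field])
        except (KeyError, TypeError, ValueError):
            continue  # missing key, None, or text such as "n/a"
        if math.isnan(value):
            continue
        present.append({**row, field: value})

    if len(present) < 4:
        return present  # too few points to estimate quartiles sensibly

    q1, _, q3 = statistics.quantiles([r[field] for r in present], n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in present if low <= r[field] <= high]


# Made-up placeholder rows (not real housing data)
raw = [
    {"monthly_rent": 100},
    {"monthly_rent": 110},
    {"monthly_rent": None},
    {"monthly_rent": 120},
    {"monthly_rent": "n/a"},
    {"monthly_rent": 130},
    {"monthly_rent": 140},
    {"monthly_rent": 10000},
]

cleaned = clean_rows(raw, "monthly_rent")
```

Whether to drop, cap, or keep outliers is a judgment call for housing data, since genuinely high values may be valid records rather than errors.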
Before changing the task status to 'Review' or 'Done', please provide a comment (and screenshots if appropriate) as documentary evidence of task completion.
EPIC: AI-tasks