Closed: teslashibe closed this issue 2 months ago
Spike, needs acceptance criteria
Added acceptance criteria.
@lacyg4 @giovaroma this is blocked by: https://github.com/masa-finance/roadmap/issues/38
TL;DR: we cannot brainstorm this until we decide which datasets we will have available to train models on FLock.io.
Loom video outlining the outcome of the spike: https://www.loom.com/share/830b0f0c70624c9da47eaa3e15ac99a3?sid=841dc6b1-954f-4b3b-9e1d-63009521857c
Next steps @lacyg4 @teslashibe: Identify quick wins and low-hanging fruit among datasets.
Prioritized dataset opportunities and identified data sources to feed each dataset type, by channel. See the image below for details. This list includes
Some specific topics to cover on the datasets can be:
Action items: Scrape data and bundle the last 30 days' worth. We can limit it to 5k records to start.
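As a rough illustration of the bundling step, here is a minimal sketch that keeps only records from the last 30 days and caps the bundle at 5k. The record shape (a dict with a timezone-aware `scraped_at` field) and the function name `bundle_recent` are assumptions made for illustration, not something defined in this ticket.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = 30      # "last 30 days" from the action item
MAX_RECORDS = 5_000   # "5k records to start"


def bundle_recent(records, now=None, window_days=WINDOW_DAYS, max_records=MAX_RECORDS):
    """Keep records from the last `window_days`, newest first, capped at `max_records`.

    Assumption: each record is a dict with a timezone-aware `scraped_at`
    datetime (hypothetical field name).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=window_days)
    # Keep only records inside the window [cutoff, now].
    recent = [r for r in records if cutoff <= r["scraped_at"] <= now]
    # Newest first, so the cap drops the oldest records.
    recent.sort(key=lambda r: r["scraped_at"], reverse=True)
    return recent[:max_records]
```

Sorting newest-first before applying the cap means that if a source produces more than 5k records in the window, the bundle keeps the most recent ones.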
@mudler @Luka-Loncar @lacyg4
Should be picked up by eng team from this ticket https://github.com/orgs/masa-finance/projects/14?pane=issue&itemId=71468733
Problem
We do not know which static datasets are most valuable. Analyzing the data collected from the sales funnel on Airtable, together with aggregated user feedback, will let us assess dataset opportunities by identifying overlapping requests and use cases.
Podcasts are very low-hanging fruit because they are easy to generate.
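One way the overlap analysis could be approached is sketched below: count how often each dataset type is requested and in how many distinct sources it appears. The input shape, a list of `(source, dataset_type)` pairs exported by hand from Airtable and the feedback notes, and the function name `rank_dataset_requests` are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def rank_dataset_requests(requests):
    """Rank dataset types by distinct sources, then by total mentions.

    `requests` is an iterable of (source, dataset_type) pairs, for example
    ("airtable", "podcasts"). This shape is hypothetical.
    """
    mentions = Counter()
    sources = defaultdict(set)
    for source, dataset_type in requests:
        key = dataset_type.strip().lower()  # normalize casing and whitespace
        mentions[key] += 1
        sources[key].add(source)
    # Each row is (dataset_type, distinct_sources, total_mentions).
    return sorted(
        ((k, len(sources[k]), mentions[k]) for k in mentions),
        key=lambda row: (-row[1], -row[2], row[0]),
    )
```

Ranking by distinct sources first favors dataset types requested in both the sales funnel and user feedback over types mentioned many times in a single channel.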
Acceptance Criteria
Acceptance Criteria Checklist
Data Collection and Aggregation
Use Case Mapping and Impact
Opportunity Analysis
Technical Feasibility Assessment
Actionable Recommendations
Additional checklist from Brendan:
Podcasts:
Extract text, diarize, vectorize
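
A minimal sketch of the last two stages, assuming transcription and diarization have already been run upstream by tools not shown here and have produced speaker-labelled segments. The `Segment` type and both function names are hypothetical. The vectorizer is a hashed bag-of-words used purely as a stand-in for whatever embedding model is eventually chosen.

```python
import hashlib
import math
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # diarization label, e.g. "A"
    start: float   # seconds
    end: float     # seconds
    text: str      # transcribed text


def merge_turns(segments):
    """Merge consecutive segments by the same speaker into single turns."""
    turns = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1].speaker == seg.speaker:
            last = turns[-1]
            turns[-1] = Segment(last.speaker, last.start,
                                max(last.end, seg.end), f"{last.text} {seg.text}")
        else:
            turns.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return turns


def hash_vectorize(text, dim=64):
    """Hashed bag-of-words, L2-normalized. A placeholder, not a real embedding."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec
```

Merging segments into speaker turns before vectorizing keeps each vector tied to one speaker and one time range, which makes the resulting records easier to filter later.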