Let's discuss, add tasks, modify, and then we can make them into separate issues and track via kanban.
Discuss how we track tasks and progress. The kanban in here? Something else? Do we want multiple GitHub projects for frontend, backend, infrastructure, "project at large"?
Decide on where we will host. @robot-o wants cheap, eco power, in a country that respects human rights. I'd like API access, load balancers, "easy button", in addition.
It appears all pieces have been found. Think about the generic framework, not a finder for more pieces of this puzzle. Once we have confirmation, I will delete all tasks in here that are specific to this puzzle
Decision on DB reached: 5 for Postgres, none for Maria, none for Santana
Decide on a timeline and minimum feature set for NG to go live.
Rerun confidence logic on fully hashed current DB and find solutions
Refactor DB somewhat, not full fat, but at least a little saner without needing major code changes for final puzzle finding
Same selection logic for API and legacy
Same data input validation and data handling for API and legacy
Do we want client IP hash back in selection logic so someone won't transcribe the same pic twice? Might be worthwhile with a smaller pool of transcribers
Change client IP hash to be peppered with HMAC everywhere. We could use DJANGO_SECRETS or whatever that env is called. This change may require us to throw away the current hashes in the DB, which is fine
Decide where we even need client IP/ID hash in DB
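A minimal sketch of what a peppered HMAC client IP hash could look like, in Python only because the item above mentions a Django secret. The env variable name `CLIENT_HASH_PEPPER` and both function names are hypothetical; nothing here is taken from the existing code.

```python
import hashlib
import hmac
import os

# Hypothetical env variable name; the real deployment may use a different one.
PEPPER_ENV = "CLIENT_HASH_PEPPER"


def client_ip_hash(ip: str, pepper: bytes) -> str:
    """Return a hex HMAC-SHA256 of the client IP, keyed with a secret pepper."""
    return hmac.new(pepper, ip.strip().encode("utf-8"), hashlib.sha256).hexdigest()


def same_client(ip: str, stored_hash: str, pepper: bytes) -> bool:
    """Compare a freshly computed hash against a stored one in constant time."""
    return hmac.compare_digest(client_ip_hash(ip, pepper), stored_hash)


if __name__ == "__main__":
    # The fallback pepper is for local experiments only (assumption).
    pepper = os.environ.get(PEPPER_ENV, "dev-only-pepper").encode("utf-8")
    h = client_ip_hash("203.0.113.7", pepper)
    print(h, same_client("203.0.113.7", h, pepper))
```

Changing the pepper changes every hash, which is why the stored hashes would have to be thrown away when this is rolled out.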
"Core" unit tests for data validation, confidence logic. The things that break us completely if they are wrong. Should cover legacy and API. "Days of debugging can you save you hours of writing tests"
Scrape image URLs into a local store and verify they are legit and not genitals
Change logic to serve images from local store
Do we accept new images at this point? If we do, do we still want to give priority to them?
Iterate on findImage, and have new image submission actually save the image to local store. Limit max size of image.
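For the max image size item, one possible approach is to read the upload in chunks and abort as soon as the cap is exceeded, rather than trusting a declared length. This is only a sketch: `read_limited`, `ImageTooLarge` and the 5 MiB cap are all assumptions, not existing code or an agreed limit.

```python
import io
from typing import BinaryIO

MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumed cap; the actual limit is undecided


class ImageTooLarge(ValueError):
    """Raised when an upload exceeds the configured size cap."""


def read_limited(stream: BinaryIO, max_bytes: int = MAX_IMAGE_BYTES,
                 chunk_size: int = 64 * 1024) -> bytes:
    """Read the whole stream, raising ImageTooLarge once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ImageTooLarge(f"image exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    data = read_limited(io.BytesIO(b"\x89PNG" + b"\x00" * 100))
    print(len(data))
```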
Discuss confidence logic and come to a good consensus
Discuss bad image logic and come to a good consensus
Bring in DB of known solutions from streamer data. What do we trust? How far?
Different page if user finds a new solution, extra celebration if it's verified
Stats at top of page, including # of pieces left to find (estimated) and net new pieces found, plus number of transcriptions, solutions, and new pieces found over the last hour / day / week
Decide on backup strategy. Regular snapshots? Regular DB dumps via crontab? Probably both in case of major code snafu that screws DB contents.
Document team experts in specific categories, so that when someone goes "I need help with", they know whom to hit up
If we are feeling antsy and if final will be long running:
User ID, name and generated pass phrase (correct horse battery staple)
Ability to change pass phrase if lost, but that would require an email. We can ask for it optionally when creating an account, with a warning that not giving an email makes it impossible to reset the password
Password requirement: minimum length N (12?), with no other complexity requirements
Store name, peppered password hash, email address, registration time, last logon, last submission
Submissions linked to user ID
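The account items above could be sketched roughly as follows, using only the Python standard library. Everything named here is an assumption for illustration: the tiny word list (a real one would hold thousands of words), the PBKDF2 iteration count, and the choice to combine the pepper with a per-user salt and a slow hash rather than relying on the pepper alone.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Tiny illustrative word list (assumption); far too small for real use.
WORDS = ["correct", "horse", "battery", "staple", "puzzle", "kanban", "piece", "stream"]
MIN_LENGTH = 12        # the "N" from the list; 12 is the value floated there
ITERATIONS = 200_000   # assumed PBKDF2 work factor, to be tuned for the host


def generate_passphrase(n_words: int = 4) -> str:
    """Pick n_words at random with a cryptographically secure generator."""
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))


def hash_password(password: str, pepper: bytes,
                  salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only a length rule is enforced, no complexity rule."""
    if len(password) < MIN_LENGTH:
        raise ValueError(f"password must be at least {MIN_LENGTH} characters")
    salt = salt or secrets.token_bytes(16)
    # Apply the secret pepper first, then a salted slow hash.
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    digest = hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, pepper: bytes, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    try:
        _, digest = hash_password(password, pepper, salt)
    except ValueError:
        return False
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    phrase = generate_passphrase()
    salt, digest = hash_password(phrase, b"dev-only-pepper")
    print(phrase, verify_password(phrase, b"dev-only-pepper", salt, digest))
```

Under this sketch the stored record would hold the salt and digest next to the name, optional email, and timestamps; the pepper stays out of the DB.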
Maybe not for this one but for the general framework: rate limiting. An account can only submit a solution every N1 seconds, and a "bad image" flag every N2 seconds. Any account attempting to circumvent this gets auto-banned, with a page explaining the appeals process
Personal stats, as well as leaderboard across top. Gamify it!
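The rate limiting idea above (one solution every N1 seconds, one "bad image" flag every N2 seconds, auto-ban on abuse) might look something like this in-memory sketch. The interval values, the `max_violations` threshold, and the class itself are placeholders, since N1, N2, and what counts as "attempting to circumvent" are still undecided.

```python
import time
from typing import Callable, Dict, Set, Tuple

# Placeholder intervals in seconds; N1 and N2 are not decided yet.
DEFAULT_INTERVALS = {"solution": 5.0, "bad_image": 30.0}


class RateLimiter:
    """Per-account minimum interval per action, with a ban after repeated violations."""

    def __init__(self, intervals: Dict[str, float] = DEFAULT_INTERVALS,
                 clock: Callable[[], float] = time.monotonic,
                 max_violations: int = 3) -> None:
        self.intervals = intervals
        self.clock = clock                    # injectable so tests can fake time
        self.max_violations = max_violations  # assumed ban threshold
        self.last_seen: Dict[Tuple[str, str], float] = {}
        self.violations: Dict[str, int] = {}
        self.banned: Set[str] = set()

    def allow(self, account_id: str, action: str) -> bool:
        """Return True if the action may proceed; count a violation otherwise."""
        if account_id in self.banned:
            return False
        now = self.clock()
        key = (account_id, action)
        last = self.last_seen.get(key)
        if last is not None and now - last < self.intervals[action]:
            count = self.violations.get(account_id, 0) + 1
            self.violations[account_id] = count
            if count >= self.max_violations:
                self.banned.add(account_id)
            return False
        self.last_seen[key] = now
        return True


if __name__ == "__main__":
    limiter = RateLimiter()
    print(limiter.allow("user-1", "solution"), limiter.allow("user-1", "solution"))
```

A real deployment with more than one server process would need shared state (the DB or a cache) instead of in-process dictionaries.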
Think about cost covering and charity - do we take donations? If so we need to be very transparent about how much we received, how much covered our costs, and how much went to charity, verifiable. There may be some real temptation once money enters the picture.
Decision on backend stack pretty clear: 5 NodeJS + Express, 1 NodeJS + Sails.js, 0 each for Python + Flask / Django