Tech Interview Checklist and Approach
1. WebUI
- [x] Sign in - ✅ WITH AUTH0
- [x] Sign up - ✅ WITH AUTH0
- [x] Upload a keyword file - ✅ DONE WITH PAPAPARSE
- [x] View list of keywords - ✅ REACT-TABLE
- [x] View the search result information for each keyword - ✅ SHOWN WITH REACT-TABLE; the htmlCode is stored as a jsonblob link via the jsonblob API
- [x] Search across all reports - ✅ DATABASE ACTION WITH pg
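The upload step needs the parsed CSV reduced to a clean keyword list before anything is scraped. The project uses PapaParse for parsing; the dependency-free sketch below only illustrates the clean-up that could follow it. The function name `extractKeywords` is hypothetical, and the sketch assumes a simple file (one keyword per line or comma-separated cells, no quoted fields).

```typescript
// Hypothetical helper: turns the raw text of an uploaded CSV into a clean keyword list.
// Assumes one keyword per line or comma-separated cells, with no quoted fields.
function extractKeywords(csvText: string): string[] {
  const seen = new Set<string>();
  const keywords: string[] = [];
  for (const cell of csvText.split(/[\r\n,]+/)) {
    const keyword = cell.trim();
    if (keyword === "") continue; // ignore blank cells and trailing newlines
    const key = keyword.toLowerCase();
    if (seen.has(key)) continue; // skip case-insensitive duplicates
    seen.add(key);
    keywords.push(keyword); // keep the first spelling that was seen
  }
  return keywords;
}
```

Deduplicating before the upload keeps the number of scraping requests down, which matters given the rate limiting discussed in the Approach section.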
2. API
Mostly NextJS API routes, plus serverless functions on AWS Lambda
- For uploading - POST /api/search - integrated with Lambda behind the scenes
- Lambda Logic Repository: https://github.com/php1301/lambda-google
- For the sake of quick development - leveraged the serverless framework with a free-tier AWS account
- Searching in Database - POST /api/user-search
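As a rough illustration of the database search behind POST /api/user-search, the sketch below builds a parameterized query object of the `{ text, values }` shape that pg's `query()` accepts. The table name `keywords` and the columns `id`, `keyword` and `user_id` are assumptions made for the example, not the project's actual schema, and `buildKeywordSearch` is a hypothetical helper.

```typescript
// Shape accepted by pg's client.query(): SQL text plus positional values.
interface SearchQuery {
  text: string;
  values: string[];
}

// Hypothetical table and column names; adjust to the real schema.
function buildKeywordSearch(userId: string, term: string): SearchQuery {
  // Escape LIKE wildcards (and the escape character itself) so user input
  // is matched literally. Backslash is PostgreSQL's default LIKE escape.
  const escaped = term.trim().replace(/[\\%_]/g, (ch) => `\\${ch}`);
  return {
    text:
      "SELECT id, keyword FROM keywords " +
      "WHERE user_id = $1 AND keyword ILIKE $2 " +
      "ORDER BY keyword",
    values: [userId, `%${escaped}%`],
  };
}
```

Passing the search term as a bound value rather than concatenating it into the SQL text avoids SQL injection; the API route would hand the returned object to the pg client.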
3. Technical Requirements
- ✅ Use a web framework of your choice - NextJS
- ✅ Use PostgreSQL.
- ✅ For the interface, front-end frameworks such as Bootstrap, Tailwind or Foundation can be used. Use SASS as the CSS preprocessor - Used Tailwind
- ✅ Extra points are provided to the neatness and user-friendliness of the frontend.
- ✅ Use Git during the development process. Push to a public repository on Github or Gitlab. Make regular commits and merge code using pull requests - Integrated linter, unit testing with Github Actions
- ✅ Write tests using your framework of choice.
- Optional: deploy the application to a cloud provider e.g. Heroku, AWS, Google Cloud or Digital Ocean -> Still in progress; may be dropped depending on free-tier eligibility
4. Approach
- Overview of ER Diagram for database:
- ⛔⛔ 429 Too Many Requests - the core problem is keeping Google from blocking our requests
- Request factors: can come from the User-Agent, remote address, IP, headers, cookies, fingerprint, headless browser, random request delay...
- The most viable and easiest approaches all involve rotating the factors above: mostly the User-Agent and the IP address, using either a paid proxy (costly) or our own SOCKS proxy server (Tor) -> this would lead to the optional requirement - deploying on a cloud provider
- => Goals: cost optimization and speed. A headless browser like Puppeteer is not optimized for this, and a premium proxy is probably overkill for this technical assignment
- => ✅✅✅ The Lambda free-tier approach suits this workload, thanks to AWS's generous IP pool -> beyond the free tier, EventBridge could be considered for a daily cron scraping job
- => Rotating the Lambda IP -> the best trick here: update the Lambda configuration (e.g. the Environment) to get a new IP
- => Not guaranteed 100% of the time (approx. 80-90% of requests avoid a 429) -> implement Axios-Retry together with the trick above to get a new IP
- => Still not guaranteed -> redeploy the Lambda function via aws-sdk or a serverless script -> tried and worked
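The retry-and-rotate idea above can be sketched independently of Axios-Retry and the AWS SDK. This is a minimal illustration, not the project's implementation: `fetchWithRotation`, the `rotate` hook and the `{ status }` error shape are all assumptions. In the real setup, `rotate` is where the Lambda configuration update (or redeploy) would be triggered.

```typescript
// Error shape assumed here: the HTTP client throws an object carrying the status code.
interface HttpError {
  status: number;
}

interface RetryOptions {
  maxAttempts: number;
  // Hypothetical hook: where the Lambda configuration update (new IP) would go.
  rotate: () => Promise<void>;
  // Injectable so the delay can be skipped in tests.
  sleep?: (ms: number) => Promise<void>;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fetchWithRotation<T>(
  request: () => Promise<T>,
  { maxAttempts, rotate, sleep = defaultSleep }: RetryOptions
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      const status = (err as Partial<HttpError> | null | undefined)?.status;
      // Only 429 is retried; other errors and the final attempt propagate.
      if (status !== 429 || attempt >= maxAttempts) throw err;
      await rotate();
      // Random delay between 500 ms and 1500 ms before the next attempt.
      await sleep(500 + Math.floor(Math.random() * 1000));
    }
  }
}
```

Keeping the rotation behind a callback means the same loop works whether the new IP comes from a configuration update, a redeploy, or a proxy switch.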
5. Screenshot
- Homepage
- When uploaded keyword
- 94 keywords scraped
- Request's time
- Database
- Searching keyword
6. Limitations
- Due to the time limit, the source code may not follow best practices yet (working on it by actively pushing commits)
7. Reproduction
- run yarn, then add the necessary env variables in .example.env
- available routes:
- homepage: '/'
- keyword searching in database: '/my-keywords'
- Upload csv of keywords: '/search'
- Create Database with SQL script
- keywords.csv in src/mocks folder
Available Scripts
Running the development server.
yarn dev
Building for production.
yarn build
Running the production server.
yarn start
TailwindCSS
A utility-first CSS framework packed with classes like flex, pt-4, text-center and rotate-90 that can be composed to build any design, directly in your markup.
Go To Documentation
SASS/SCSS
Sass is a stylesheet language that’s compiled to CSS. It allows you to use variables, nested rules, mixins, functions, and more, all with a fully CSS-compatible syntax.
Go To Documentation
Axios
Promise based HTTP client for the browser and node.js.
Go To Documentation
Environment Variables
Use environment variables in your next.js project for server side, client or both.
Go To Documentation
Reverse Proxy
Proxying some URLs can be useful when you have a separate API backend development server and you want to send API requests on the same domain.
Go To Documentation
React Query
Hooks for fetching, caching and updating asynchronous data in React.
Go To Documentation
react-use
A Collection of useful React hooks.
Go To Documentation
Zustand
A small, fast and scalable bearbones state-management solution using simplified flux principles.
Go To Documentation
ESLint
A pluggable and configurable linter tool for identifying and reporting on patterns in JavaScript. Maintain your code quality with ease.
Go To Documentation
Prettier
An opinionated code formatter; Supports many languages; Integrates with most editors.
Go To Documentation
lint-staged
The concept of lint-staged is to run configured linter (or other) tasks on files that are staged in git.
Go To Documentation
Testing Library
The React Testing Library is a very light-weight solution for testing React components. It provides light utility functions on top of react-dom and react-dom/test-utils.
Go To Documentation
Cypress
Fast, easy and reliable testing for anything that runs in a browser.
Go To Documentation
Docker
Docker simplifies and accelerates your workflow, while giving developers the freedom to innovate with their choice of tools, application stacks, and deployment environments for each project.
Go To Documentation
Github Actions
GitHub Actions makes it easy to automate all your software workflows, now with world-class CI/CD. Build, test, and deploy your code right from GitHub.
Go To Documentation
License
MIT