buigiaanfb1 / extract-data

[Feature] Keywords must be processed asynchronously #19

Open olivierobert opened 1 year ago

olivierobert commented 1 year ago

Issue

In the current implementation, keywords are processed one by one (in a loop) in the controller:

https://github.com/buigiaanfb1/extract-data/blob/10182a427a237113d64811701e57c7308d567509/app/controllers/search.controller.js#L28-L49

Since the code challenge allows users to upload up to 100 keywords, scraping them sequentially inside the upload request will likely cause the request to time out. In addition, if an error occurs while scraping one keyword, the remaining keywords are not processed. For instance, the deployed application crashed:

[Image: screenshot of the crashed deployed application]
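
To illustrate the failure mode described above, here is a minimal sketch. It is not the repository's actual controller code: `processAllOrNothing`, `processIsolated`, and the `scrape` callback are hypothetical names used only for illustration. The first function shows how a single rejected scrape aborts a sequential loop; the second shows one way to isolate failures per keyword.

```javascript
// Hypothetical illustration; not the repository's actual controller code.
// `scrape` is an assumed async function: (keyword) => Promise<result>.

// One failing keyword rejects the whole call and skips every later keyword.
async function processAllOrNothing(keywords, scrape) {
  const results = [];
  for (const keyword of keywords) {
    // A rejection here propagates out of the loop.
    results.push(await scrape(keyword));
  }
  return results;
}

// Each keyword is wrapped in its own try/catch, so one failure is recorded
// and the loop carries on with the remaining keywords.
async function processIsolated(keywords, scrape) {
  const results = [];
  for (const keyword of keywords) {
    try {
      results.push({ keyword, ok: true, value: await scrape(keyword) });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ keyword, ok: false, error: message });
    }
  }
  return results;
}
```

Isolating errors this way addresses only the second problem: the work still runs inside the request, so the timeout risk remains until the scraping is moved to background jobs.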

Expected

The key challenge in this coding exercise is to implement background jobs. While this is not stated explicitly in the requirements, we want candidates to arrive at the following flow for processing keywords:

  1. Upload CSV
  2. Validate the uploaded keywords (not empty, no more than 100 keywords)
  3. Insert the keywords into a database table (to ensure no data is lost, a flag attribute should be added to track the scraping status)
  4. Schedule one background job for each keyword
  5. Update keyword records with the scraping result
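
The five steps above can be sketched as follows. This is a rough illustration under stated assumptions, not a definitive implementation: the CSV is assumed to contain one keyword per line, the `Map` stands in for a database table, and `setImmediate` stands in for a real job queue (in practice, a persistent queue such as BullMQ backed by Redis would be one option). All names (`keywordTable`, `parseAndValidate`, `insertKeyword`, `scheduleJob`, `handleUpload`, `waitForJobs`) and the status values are hypothetical.

```javascript
// Hypothetical in-memory stand-ins: a real app would use a database table
// and a persistent job queue instead of a Map and setImmediate.
const MAX_KEYWORDS = 100;
const keywordTable = new Map(); // id -> { id, keyword, status, result, error }
const pendingJobs = [];
let nextId = 1;

// Steps 1-2: parse the uploaded CSV text (assumed: one keyword per line)
// and validate it (not empty, no more than 100 keywords).
function parseAndValidate(csvText) {
  const keywords = csvText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (keywords.length === 0) throw new Error('No keywords found');
  if (keywords.length > MAX_KEYWORDS) {
    throw new Error(`Too many keywords (max ${MAX_KEYWORDS})`);
  }
  return keywords;
}

// Step 3: store each keyword with a status flag before any scraping starts.
function insertKeyword(keyword) {
  const record = { id: nextId++, keyword, status: 'pending', result: null, error: null };
  keywordTable.set(record.id, record);
  return record;
}

// Steps 4-5: one job per keyword; a failure only marks that keyword's record.
function scheduleJob(record, scrape) {
  const job = new Promise((resolve) => setImmediate(resolve))
    .then(() => {
      record.status = 'processing';
      return scrape(record.keyword);
    })
    .then((result) => {
      record.status = 'done';
      record.result = result;
    })
    .catch((err) => {
      record.status = 'failed';
      record.error = err instanceof Error ? err.message : String(err);
    });
  pendingJobs.push(job);
}

// Controller-level entry point: returns the new record ids right after
// scheduling, without waiting for any scraping to finish.
function handleUpload(csvText, scrape) {
  const records = parseAndValidate(csvText).map(insertKeyword);
  records.forEach((record) => scheduleJob(record, scrape));
  return records.map((record) => record.id);
}

// Demo helper: resolves once every scheduled job has settled.
function waitForJobs() {
  return Promise.all(pendingJobs);
}
```

With this shape, the upload request returns as soon as the keywords are stored and the jobs are scheduled, and the client can query the status flag later to see which keywords are pending, done, or failed.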