ltenzil / scrape

Scrape talks about building reports on user entered keywords and its search results (Google)
0 stars 0 forks source link

CSV upload validation handling #11

Open longnd opened 2 years ago

longnd commented 2 years ago

The current validation for CSV file upload isn't sufficient. It just relies on the exception raised and there is no actual validation on the uploaded file https://github.com/ltenzil/scrape/blob/6eb163261a9102f748fc789147840f7856584ef8/app/controllers/keywords_controller.rb#L74-L88

It can be improved to address the following issues

  1. Actually check if the uploaded file is a valid CSV file type based on the file extension and MIME-type
  2. Handle unexpected format. The application is expecting the file in this format
    keyword_1, keyword_2, keyword_3

    but this is also a valid CSV format

    keyword
    keyword_1
    keyword_2
    keyword_3

    if the user uploads the CSV file in the 2nd format, the app treats all of the content as a single keyword "keyword keyword_1 keyword_2 keyword_3". It's better to provide the user with a file template to follow.

ltenzil commented 2 years ago

hmm , ok got it. will refactor it. I thought the second format is text file.

longnd commented 2 years ago

infact, the second format is valid, refer to rfc4180. It has a header line and 3 records, each record is a single search keyword.

ltenzil commented 2 years ago

resolved @ #24

longnd commented 2 years ago

I think the validation for file upload still isn't sufficient. It still relies on the exception raised and there is no actual validation on the uploaded file, e.g. based on the file extension or Mime type. Besides that, it would be better to made the fix for each issue in a separate PR :)