ContentMine / quickscrape

A scraping command line tool for the modern web
MIT License
259 stars 42 forks

Quickscrape fails to properly sanitise filenames #85

Closed by tarrow 8 years ago

tarrow commented 8 years ago
info: processing URL: http://journals.humankinetics.com/jpah-current-issue/jpah-volume-13-issue-5-may/a-qualitative-investigation-of-australian-youth-perceptions-to-enhance-school-physical-activity-the-environmental-perceptions-investigation-of-childrenrsquos-physical-activity-epic-pa-study
fs.js:916
  return binding.mkdir(pathModule._makeLong(path),
                 ^

Error: ENAMETOOLONG: name too long, mkdir '/Users/pm286/workspace/cmdev/norma-dev/xref/daily/2016-05-01/20160501_60x/http_journals.humankinetics.com_jpah-current-issue_jpah-volume-13-issue-5-may_a-qualitative-investigation-of-australian-youth-perceptions-to-enhance-school-physical-activity-the-environmental-perceptions-investigation-of-childrenrsquos-physical-activity-epic-pa-study'
    at Error (native)
    at Object.fs.mkdirSync (fs.js:916:18)
    at processUrl (/Users/pm286/workspace/cmdev/quickscrape/bin/quickscrape.js:229:8)
    at Timeout.checkForNext [as _repeat] (/Users/pm286/workspace/cmdev/quickscrape/bin/quickscrape.js:193:7)
    at Timeout.wrapper [as _onTimeout] (timers.js:417:11)
    at tryOnTimeout (timers.js:224:11)
    at Timer.listOnTimeout (timers.js:198:5)
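
The trace shows `fs.mkdirSync` being handed a directory name built from the full URL, and that single path component exceeds what the filesystem accepts. One possible way to avoid this is sketched below. It is not quickscrape's actual code: `safeDirName` and `MAX_NAME_BYTES` are hypothetical names, and the 255-byte cap is an assumption that holds for many common filesystems but not all. The idea is to keep the readable prefix and append a short hash of the full URL so that distinct long URLs are very unlikely to collide.

```javascript
const crypto = require('crypto');

// Assumed per-component limit; many filesystems use 255 bytes, but this varies.
const MAX_NAME_BYTES = 255;

// Hypothetical helper (not part of quickscrape): turn a URL into a
// directory name that is restricted to safe characters and capped in length.
function safeDirName(url, maxBytes = MAX_NAME_BYTES) {
  // Mirror the naming seen in the log: "://" and "/" become "_".
  // Anything outside a conservative ASCII whitelist is also replaced.
  const cleaned = url
    .replace(/:\/\//, '_')
    .replace(/[^A-Za-z0-9._-]/g, '_');

  // The whitelist leaves only ASCII, so byte length equals string length.
  if (Buffer.byteLength(cleaned) <= maxBytes) {
    return cleaned;
  }

  // Too long: truncate and append a short hash of the original URL
  // so that different long URLs still map to different names.
  const hash = crypto.createHash('sha1').update(url).digest('hex').slice(0, 10);
  return cleaned.slice(0, maxBytes - hash.length - 1) + '_' + hash;
}
```

With something like this in place, the name passed to `mkdirSync` would be `safeDirName(url)` rather than the raw substituted URL. Short URLs keep the naming already seen in the log, and only over-long ones are shortened.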