Open mirkathaha opened 8 years ago
I have a database on my server; you can try removing the DB pipeline from the settings!
ITEM_PIPELINES = { 'TianMao.pipelines.TianmaoPipeline': 300, 'TianMao.pipelines.DBPipeline': 300 }
You can try removing this line: 'TianMao.pipelines.DBPipeline': 300,
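For reference, a minimal sketch of what the setting might look like after that change. Only the two pipeline names come from the settings quoted above; keeping the line as a comment rather than deleting it is just one option, and the reason in the comment is an assumption.

```python
# settings.py (sketch): keep the pipeline that does not need a database.
ITEM_PIPELINES = {
    'TianMao.pipelines.TianmaoPipeline': 300,
    # Disabled here on the assumption that no database is reachable:
    # 'TianMao.pipelines.DBPipeline': 300,
}
```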
Thanks, I've removed it but I still can't pull data from Taobao. I'm getting a 302 redirect.
2016-05-17 14:37:06 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-05-17 14:37:06 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-05-17 14:37:10 [scrapy] DEBUG: Redirecting (302) to <GET https://login.taobao.com/jump?target=https%3A%2F%2Flist.tmall.com%2Fsearch_product.htm%3Ftbpm%3D1%26spm%3Da220m.1000858.1000724.4.tBcJlT%26cat%3D50025829%26q%3D%25B6%25CC%25D1%25A5%25C5%25AE%26sort%3Dd%26style%3Dg%26from%3D.list.pc_1_suggest%26suggest%3D0_2%26tmhkmain%3D0> from <GET https://list.tmall.com/search_product.htm?spm=a220m.1000858.1000724.4.tBcJlT&cat=50025829&q=%B6%CC%D1%A5%C5%AE&sort=d&style=g&from=.list.pc_1_suggest&suggest=0_2&tmhkmain=0#J_Filter>
2016-05-17 14:37:12 [scrapy] DEBUG: Redirecting (302) to <GET https://pass.tmall.com/add?_tb_token_=yUktmhy73cOS&cookie2=3ab58ca57fa6ab058722f8927322ec59&t=2ea56f154c9d80e51a3258ae40e6de4b&target=https%3A%2F%2Flist.tmall.com%2Fsearch_product.htm%3Ftbpm%3D1%26spm%3Da220m.1000858.1000724.4.tBcJlT%26cat%3D50025829%26q%3D%25B6%25CC%25D1%25A5%25C5%25AE%26sort%3Dd%26style%3Dg%26from%3D.list.pc_1_suggest%26suggest%3D0_2%26tmhkmain%3D0&pacc=Us0oRlM7phFEAiRjYkcrDA==&opi=128.199.107.85&tmsc=1463467031948199> from <GET https://login.taobao.com/jump?target=https%3A%2F%2Flist.tmall.com%2Fsearch_product.htm%3Ftbpm%3D1%26spm%3Da220m.1000858.1000724.4.tBcJlT%26cat%3D50025829%26q%3D%25B6%25CC%25D1%25A5%25C5%25AE%26sort%3Dd%26style%3Dg%26from%3D.list.pc_1_suggest%26suggest%3D0_2%26tmhkmain%3D0>
2016-05-17 14:37:13 [scrapy] DEBUG: Redirecting (302) to <GET https://list.tmall.com/search_product.htm?tbpm=1&spm=a220m.1000858.1000724.4.tBcJlT&cat=50025829&q=%B6%CC%D1%A5%C5%AE&sort=d&style=g&from=.list.pc_1_suggest&suggest=0_2&tmhkmain=0> from <GET https://pass.tmall.com/add?_tb_token_=yUktmhy73cOS&cookie2=3ab58ca57fa6ab058722f8927322ec59&t=2ea56f154c9d80e51a3258ae40e6de4b&target=https%3A%2F%2Flist.tmall.com%2Fsearch_product.htm%3Ftbpm%3D1%26spm%3Da220m.1000858.1000724.4.tBcJlT%26cat%3D50025829%26q%3D%25B6%25CC%25D1%25A5%25C5%25AE%26sort%3Dd%26style%3Dg%26from%3D.list.pc_1_suggest%26suggest%3D0_2%26tmhkmain%3D0&pacc=Us0oRlM7phFEAiRjYkcrDA==&opi=128.199.107.85&tmsc=1463467031948199>
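To see where the first redirect in the log above is heading, the `target` query parameter can be decoded with the standard library. This is only a sketch for reading the log: the URL below is an abbreviated copy of the logged login.taobao.com jump URL (most query parameters dropped for brevity), and `redirect_target` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qs

# Abbreviated copy of the first redirect URL from the log (parameters trimmed).
JUMP_URL = (
    "https://login.taobao.com/jump?target="
    "https%3A%2F%2Flist.tmall.com%2Fsearch_product.htm%3Ftbpm%3D1%26cat%3D50025829"
)

def redirect_target(url):
    """Return the decoded `target` parameter of a redirect URL, or None if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("target")
    return values[0] if values else None

target = redirect_target(JUMP_URL)
print(target)
```

The decoded target is the original search page with `tbpm=1` added, which suggests the site routes the request through its login domain before returning to the listing.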
OK, I will try it! But I am at work right now...
No problem, I hope to make use of your script to scrape info from Taobao. It's very helpful.
Hope to hear from you again.
I downloaded the code and tried it, and it works fine! I only commented out 'TianMao.pipelines.DBPipeline': 300 and changed nothing else.
Are you a foreigner? Your problem is the Scrapy 302 redirect, but I do not have the same problem. Maybe that is because I am in China? I searched for the problem on Google and many people face the same issue. You can search for "scrapy redirecting 302 error" on Stack Overflow. Hope that helps!
Yup I'm a foreigner, not based in China. Maybe that's why it's redirecting for me. I tried the solution from Stack Overflow but it ain't working for me. Oh well :/
If you come across a solution or find a workaround for the 302 redirect in your script, please ping me. It would help me a lot. Thanks!
I managed to scrape the data successfully! Thanks for your script.
I have a question: how do I scrape from URL1 to URL28 automatically? Thanks.
You are welcome! But as for scraping from URL1 to URL28 automatically, I have no idea... my strength is Android.
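One common approach, sketched here under assumptions, is to collect all the URLs in a list and assign it to the spider's `start_urls`, which Scrapy then requests one by one. The real URL1 to URL28 values are not shown in this thread, so the `page` query parameter and the `build_start_urls` helper below are hypothetical placeholders; only the host, path, and `cat` value come from the log.

```python
from urllib.parse import urlencode

# Host and path as seen in the log; everything after "?" is illustrative.
BASE_URL = "https://list.tmall.com/search_product.htm"

def build_start_urls(count=28, cat="50025829"):
    """Return `count` URLs. The `page` parameter is an assumption, not Tmall's real paging scheme."""
    return [
        BASE_URL + "?" + urlencode({"cat": cat, "page": page})
        for page in range(1, count + 1)
    ]

START_URLS = build_start_urls()

# In the spider class, this assignment is enough for Scrapy to crawl every entry:
#     start_urls = START_URLS
```

If the 28 URLs already exist as separate values, putting them straight into one list and assigning that to `start_urls` works the same way.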
Sure, no problem. Thanks!
Hi there, I tried to run your Scrapy script but there were no results. I have also created a SQL DB with a table named goods_info but I'm still having issues. Can you help me out?
Connect to db successfully!
2016-05-17 14:12:21 [scrapy] INFO: Enabled item pipelines: ['TianMao.pipelines.DBPipeline', 'TianMao.pipelines.TianmaoPipeline']
2016-05-17 14:12:21 [scrapy] INFO: Spider opened
2016-05-17 14:12:21 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-05-17 14:12:21 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023