motebaya / Picuki

instagram bulk profile media downloader - scrape from picuki.com
MIT License
20 stars 13 forks

Error during scrape video #2

Closed redfalcoon closed 1 year ago

redfalcoon commented 1 year ago

python main.py -u zuck -v
14:54:43 debug:Using selector: EpollSelector
╭─────── information ────────╮
│ bio: │
│ followers: 12,007,194 │
│ following: 523 │
│ full_name: Mark Zuckerberg │
│ total_posts: 284 │
│ username: @zuck │
╰────────────────────────────╯
14:54:44 info:media ID collected: 12
14:54:44 Warning:loading more page....
14:54:44 info:media ID collected: 24
14:54:44 info:Total Media Collected: 24
14:54:44 info:Selected media: ['videos'], starting download..
14:54:44 info:getting content from: 3144049196777682809 [1 of 24]
╭────────────────────────────────────────────────────────────────────── information ──────────────────────────────────────────────────────────────────────╮
│ caption: Threads reached 100 million sign ups this weekend -- within five days of launching. Thanks to all of you who are making this fun and friendly! │
│ comments_count: 3,947 │
│ likes_count: 142,286 likes │
│ name: @zuck │
│ time: 1 day ago │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
14:54:45 Warning:there's no videos or thumbnails to download..
14:54:45 info:getting content from: 3140520095901479927 [2 of 24]
╭────────────────────────────────────────────────────────────────────────────────────────────────── information ───────────────────────────────────────────────────────────────────────────────────────────────────╮
│ caption: Meet Threads, an open and friendly public space for conversation. Our vision is to take the best parts of Instagram and create a new experience for text, ideas, and discussing what's on your mind. I │
│ think the world needs this kind of friendly community, and I'm grateful to all of you who are part of Threads from day one. Threads is available in the app store now. │
│ comments_count: 24,342 │
│ likes_count: 719,652 likes │
│ name: @zuck │
│ time: 6 days ago │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
14:54:45 info:Total videos/thumbnails Collected: 1
14:54:45 info:downloading videos ( 1 of 1)
Completed..saved as: /root/New/Picuki/zuck/videos/Y8SaNcuF32xYNKj0hSxjBnfed.mp4
14:54:46 info:getting content from: 3140510476081615763 [3 of 24]
╭──────────────── information ────────────────╮
│ caption: Threads is here. Let's do this. 🔥 │
│ comments_count: 7,262 │
│ likes_count: 347,005 likes │
│ name: @zuck │
│ time: 6 days ago │
╰─────────────────────────────────────────────╯
14:54:47 Warning:there's no videos or thumbnails to download..
14:54:47 info:getting content from: 3139898623990367685 [4 of 24]
╭──────────────────────────────────────────────────────────────────────────────────────────────── information ─────────────────────────────────────────────────────────────────────────────────────────────────╮
│ caption: Happy July 4th! 🇺🇸 Lots to be grateful for this year. As the big girls get older, I love talking to them about why America is so great. Looking forward to discussing with little Aurelia soon too. │
│ comments_count: 7,038 │
│ likes_count: 626,312 likes │
│ name: @zuck │
│ time: 1 week ago │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
14:54:47 Warning:there's no videos or thumbnails to download..
14:54:47 info:getting content from: 3127421898368101810 [5 of 24]
╭───────────────────────────────────────────────── information ─────────────────────────────────────────────────╮
│ caption: Great learning from jiu jitsu legend @mikeymusumeci... and starting to prepare for our MMA debuts 😉 │
│ comments_count: 7,425 │
│ likes_count: 206,400 likes │
│ name: @zuck │
│ time: 3 weeks ago │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
14:54:48 info:Total videos/thumbnails Collected: 1
14:54:48 info:downloading videos ( 1 of 1)
Completed..saved as: /root/New/Picuki/zuck/videos/uNhxneZ0LIYDd9NGmIH2W5BjN.mp4
14:54:49 info:getting content from: 3120828822040437226 [6 of 24]
╭───────────────────────────────────────────────────────────────────────── information ──────────────────────────────────────────────────────────────────────────╮
│ caption: Great to be back in person in Hacker Square for Meta's All Hands! So much energy and excitement for building the future of human connection together. │
│ comments_count: 3,372 │
│ likes_count: 98,093 likes │
│ name: @zuck │
│ time: 1 month ago │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
14:54:49 info:Total videos/thumbnails Collected: 1
14:54:49 info:downloading videos ( 1 of 1)
Traceback (most recent call last):
  File "/root/New/Picuki/main.py", line 274, in <module>
    asyncio.run(_main(
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/root/New/Picuki/main.py", line 236, in _main
    await _download(
  File "/root/New/Picuki/main.py", line 58, in _download
    assert url is not None, "stopped, nothing url to download!"
AssertionError: stopped, nothing url to download!

The program downloads the first and second videos and then crashes with the error above.

motebaya commented 1 year ago

Hi, thanks for reporting the issue. It is fixed in this commit: https://github.com/motebaya/Picuki/commit/52896b9ab190f9279d7df5dd8b3a257edb9c39ee

 21:34:43 info:HTTP Request: GET https://www.picuki.com/media/3120828822040437226 "HTTP/1.1 200 OK" 
 21:34:43 debug:receive_response_body.started request=<Request [b'GET']> 
 21:34:43 debug:receive_response_body.complete 
 21:34:43 debug:response_closed.started 
 21:34:43 debug:response_closed.complete 
╭───────────────────────────────────────────────────────────────────────── information ──────────────────────────────────────────────────────────────────────────╮
│ caption: Great to be back in person in Hacker Square for Meta's All Hands! So much energy and excitement for building the future of human connection together. │
│ comments_count: 3,372                                                                                                                                          │
│ likes_count: 98,093 likes                                                                                                                                      │
│ name: @zuck                                                                                                                                                    │
│ time: 1 month ago                                                                                                                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
 21:34:43 info:Total videos/thumbnails Collected: 1 
 21:34:43 info:downloading videos ( 1 of 1) 
 Completed..saved as: /media/Picuki/zuck/videos/xBNn2jZeG5YdI9DNILh0HmuWN.mp4
redfalcoon commented 1 year ago

Thank you for the answer to the issue. I've updated main, and when I start it with the command

python main.py -u zuck -v

the program starts downloading videos, but after the second video is downloaded it raises this error:

17:06:21 info:getting content from: 3120828822040437226 [6 of 24]
╭──────────────────────────────────────────── information ─────────────────────────────────────────────╮
│ caption: Great to be back in person in Hacker Square for Meta's All Hands! So much energy and │
│ excitement for building the future of human connection together. │
│ comments_count: 3,372 │
│ likes_count: 98,093 likes │
│ name: @zuck │
│ time: 1 month ago │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────╯
17:06:21 info:Total videos/thumbnails Collected: 1
17:06:21 info:downloading videos ( 1 of 1)
Traceback (most recent call last):
  File "/root/New/Picuki/main.py", line 287, in <module>
    asyncio.run(_main(
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/root/New/Picuki/main.py", line 246, in _main
    await _download(
  File "/root/New/Picuki/main.py", line 59, in _download
    assert url is not None, "stopped, nothing url to download!"
AssertionError: stopped, nothing url to download!

motebaya commented 1 year ago

Delete everything and clone the repository again. The issue happened because the regex couldn't find the video URL, and bs4 left only one element that has a src attribute.

Both problems are fixed here: https://github.com/motebaya/Picuki/blob/main/lib/Picuki.py#L211 https://github.com/motebaya/Picuki/blob/main/lib/Picuki.py#L172
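
For readers hitting a similar crash, here is a rough sketch of the failure mode described above. It is not the code from the linked lines. It assumes the media page is plain HTML, and it uses the standard-library `html.parser` as a stand-in for bs4 so it runs without extra packages. The names `VIDEO_SRC_RE`, `find_video_url` and `collect_downloads`, the regex pattern, and the example URLs are all hypothetical. The idea is to try the regex first, fall back to any `<video>`/`<source>` element that carries a `src` attribute, and skip the post instead of asserting when nothing is found.

```python
import re
from html.parser import HTMLParser
from typing import Optional

# Hypothetical pattern: a src attribute whose URL contains ".mp4".
VIDEO_SRC_RE = re.compile(r'src="(https?://[^"]+\.mp4[^"]*)"')


class _SrcCollector(HTMLParser):
    """Collects src attributes from <video> and <source> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("video", "source"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_video_url(html: str) -> Optional[str]:
    """Return a video URL from the page, or None if there is none."""
    # First attempt: the regex, which only matches .mp4 URLs.
    match = VIDEO_SRC_RE.search(html)
    if match:
        return match.group(1)
    # Fallback: parse the markup and take the first element with a src.
    collector = _SrcCollector()
    collector.feed(html)
    return collector.sources[0] if collector.sources else None


def collect_downloads(pages: dict[str, str]) -> list[tuple[str, str]]:
    """Return (media_id, url) pairs, skipping pages with no usable URL."""
    found = []
    for media_id, html in pages.items():
        url = find_video_url(html)
        if url is None:
            # Warn and move on rather than stopping the whole run.
            print(f"warning: no video URL for {media_id}, skipping")
            continue
        found.append((media_id, url))
    return found
```

With this shape, one post without a matching URL only produces a warning, and the remaining posts in the batch are still processed.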

redfalcoon commented 1 year ago

I did as you suggested and everything works fine! Thank you.