Closed (Ro-Den closed this 1 year ago)
Hello @Ro-Den. Any suggestions on how to get the new devices? They exist under makers.
@engahmed1190, the easiest way is to use your own sitemap-phones.xml: extract the links with some other tool and add them to the file. Then edit the following line so it points to the link to your file:
$sitemapSource = Http::get('https://www.gsmarena.com/sitemap-phones.xml');
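To illustrate the suggestion above, here is a minimal sketch of generating such a file from a list of links collected by another tool. The helper name `buildSitemap` and the URLs are placeholders, not part of the scraper; the sketch only assumes the scraper reads the standard `<url><loc>` sitemap layout.

```php
<?php
// Hypothetical helper (not part of the scraper): turns a list of device
// page links into a minimal sitemap document with one <loc> per link.
function buildSitemap(array $links): string
{
    $lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ];
    foreach (array_unique($links) as $link) {
        // Escape &, <, > and quotes so the XML stays well-formed.
        $loc = htmlspecialchars($link, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $lines[] = "  <url><loc>{$loc}</loc></url>";
    }
    $lines[] = '</urlset>';
    return implode("\n", $lines) . "\n";
}

// Placeholder links; replace them with the ones you extracted.
$links = [
    'https://www.gsmarena.com/placeholder_device-0001.php',
    'https://www.gsmarena.com/placeholder_device-0002.php',
];
$sitemap = buildSitemap($links);
```

The resulting string can be saved as sitemap-phones.xml and uploaded to any location the scraper can fetch, after which the `Http::get(...)` line is pointed at that address.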
Hello. Have you managed to scrape the new devices? I desperately need that device list.
You can re-scrape all the data to get the updated devices; see the readme file. Have a nice day!
I also can't extract the list of the most recent devices; it stops in February 2021.
How did you solve it?
Thanks for the rapid scraper update!
I have re-scraped the data, but all recent devices (more than 400 models to date!) are missing because sitemap-phones.xml is outdated. It looks like they update it once a year or even less frequently.
Other minor issues:
Links containing -pictures- and related.php are skipped. Links containing -3d-spin- (360° view) should be skipped, too:
if (!strpos($url->loc, 'related.php') && !strpos($url->loc, '-3d-spin-') && !strpos($url->loc, '-pictures-')) {
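The same check can be written as a small helper, sketched below. The function name `shouldSkip` and the sample URLs are hypothetical; the fragment list is taken from the condition above. The sketch compares against `false` explicitly, because `strpos()` returns 0 for a match at the very start of a string and `!strpos(...)` would then read as "not found". That cannot happen with full URLs, but the strict comparison is the safer habit.

```php
<?php
// Hypothetical helper: returns true when a sitemap URL is not a device
// page and should be skipped (related pages, picture pages, 3D spin pages).
function shouldSkip(string $loc): bool
{
    $skipFragments = ['related.php', '-3d-spin-', '-pictures-'];
    foreach ($skipFragments as $fragment) {
        // strpos() returns false only when the fragment is absent.
        if (strpos($loc, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

// Placeholder URLs for illustration only.
$candidates = [
    'https://www.gsmarena.com/placeholder_device-0001.php',
    'https://www.gsmarena.com/placeholder_device-3d-spin-0001.php',
    'https://www.gsmarena.com/placeholder_device-pictures-0001.php',
];
// array_values() reindexes the surviving entries from 0.
$deviceUrls = array_values(array_filter(
    $candidates,
    fn (string $loc): bool => !shouldSkip($loc)
));
```

Keeping the fragments in one array makes it easy to add another pattern later without lengthening the condition.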
Battery Type (#2) is still missing. Battery Stand-by (#2) and Battery Talk time (#2) are mysteriously present, but useless without Battery Type (#2). Fortunately, the information is unimportant.
E.g.: Nokia 5110
"Battery":{"Type":["Removable Li-Po 600 mAh battery"],"Stand-by":["40 - 180 h","60-270 h"],"Talk time":["2 h - 3 h 20 min","3-5 h"]}
Sub-row data is "named" after the previous row and gets into the wrong array.
E.g.: Nokia 5110
"Camera":{"Call records":["No"]}
"Display":{"Type":["Monochrome graphic"],"Size":[""],"Resolution":["5 lines","Dynamic font size\\r\\n Softkey\\r\\n Welcome message"]}
I named it "Additional Information" in my spreadsheet, but getting it all out of the wrong columns was rather time-consuming.
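One possible way to keep such sub-rows apart is sketched below. It is not the scraper's actual code: the `[category, name, value]` row shape, the function name `groupSpecRows`, and the idea that an unnamed sub-row arrives with an empty (or non-breaking-space) name are all assumptions made for illustration. Only the "Additional Information" label comes from the comment above.

```php
<?php
// Hypothetical sketch: group scraped spec rows by category and name,
// filing rows without a name under a fallback key instead of appending
// them to the previous row's values.
function groupSpecRows(array $rows, string $fallbackName = 'Additional Information'): array
{
    $specs = [];
    foreach ($rows as [$category, $name, $value]) {
        // Treat a non-breaking space like an empty name (assumption about
        // how an unnamed table cell might be scraped).
        $name = trim(str_replace("\u{00A0}", ' ', $name));
        if ($name === '') {
            $name = $fallbackName;
        }
        $specs[$category][$name][] = $value;
    }
    return $specs;
}

// Rows modelled on the Nokia 5110 "Display" example above.
$rows = [
    ['Display', 'Type', 'Monochrome graphic'],
    ['Display', 'Size', ''],
    ['Display', 'Resolution', '5 lines'],
    ['Display', '', "Dynamic font size\r\nSoftkey\r\nWelcome message"],
];
$specs = groupSpecRows($rows);
echo json_encode($specs), "\n";
```

With this grouping, "Resolution" keeps only "5 lines", and the unnamed sub-row lands in its own "Additional Information" array within the same category.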