Open georgerabus opened 10 months ago
I will take a look at it and see what went wrong.
Expect a reply any moment now.
Psst: Thanks for the interest :)
Thanks to you too for still being active :)
btw can you specify more what
productlinks = []
data = []
categories = []
are these for? Also, should they contain URL strings separated by commas?
I will love to know the branch causing the error.
I mistakenly merged the two branches (Request and Scrapy)
If you can answer then it will be much easier to know what went wrong
As for the question, "btw can you specify more what":
productlinks = []
This list holds the product links of the active scraper for further processing
data = []
This is the list that holds the data to be written to CSV file
categories = []
This holds the category names which will be used later for mapping products to their parent and child categories
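To make the roles of the three lists concrete, here is a minimal sketch of how they might fit together. The sample link, the category names, the row fields, and the helper `write_rows` are all assumptions made for illustration; none of them are taken from the actual script.

```python
import csv
import io

# The three module-level lists described above.
productlinks = []  # product links collected by the active scraper
data = []          # rows that will be written to the CSV file
categories = []    # category names used to map products to parent/child categories


def write_rows(rows, handle):
    """Write a list of dict rows to an open file-like object as CSV (hypothetical helper)."""
    if not rows:
        return
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical values; the real scraper fills these lists itself.
categories.extend(["Beauty", "Beauty > Skin Care"])
productlinks.append("https://example.com/item/1")

for link in productlinks:
    # A real run would fetch and parse the product page here.
    data.append({"url": link, "category": categories[-1]})

# An in-memory buffer stands in for the CSV file on disk.
buffer = io.StringIO()
write_rows(data, buffer)
csv_text = buffer.getvalue()
```

In the script all three lists start empty, so in this sketch they are only ever filled by code, never edited by hand.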
oh so these lists should not be touched by me, understood. I am willing to help you out to perfect this program (win:win), but unfortunately I will go to sleep now (2:30AM), I will reply to everything tomorrow morning
I will try to scrape AliExpress right now and see what went wrong. I have never tried AliExpress before. We are in almost the same timezone. It's 1:30am here.
I pray you wake up to find good news from my end
I am also curious: did you by chance forget to change the HTML elements to match those of aliexpress.com?
As for the second-to-last and last lines, ChatGPT gave this answer:
It seems like there's an issue with Java Runtime Environment (JRE) not being properly configured or available in your system. Here are a few steps you can take to resolve this:
Check Java Installation: Ensure Java is installed on your system. You can do this by running java -version in your command prompt or terminal to check if Java is properly installed and the version is displayed.
Set JAVA_HOME: Set the JAVA_HOME environment variable to point to your Java installation directory.
Update PATH: Add %JAVA_HOME%\bin to your PATH variable to ensure the system can find the Java executables.
Reinstall Java: If Java is not installed or if the installation is corrupted, consider reinstalling the latest version of Java from the official website.
If this issue occurs when executing specific commands or applications, check their documentation or support resources for any specific Java-related configurations or requirements.
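The first two checks in that answer can also be run from Python. This is only a rough sketch: the helper name `java_status` is made up for illustration, and it reports what it finds without fixing anything.

```python
import os
import shutil


def java_status(environ=None):
    """Report whether a `java` executable is on PATH and whether JAVA_HOME is set.

    Hypothetical helper; `environ` defaults to the current process environment.
    """
    env = os.environ if environ is None else environ
    return {
        # Full path to the java executable, or None if it is not found on PATH.
        "java_on_path": shutil.which("java", path=env.get("PATH")),
        # Value of JAVA_HOME, or None if the variable is not set.
        "java_home": env.get("JAVA_HOME"),
    }


status = java_status()
print(status)
```

If `java_on_path` comes back as `None`, the PATH step above is the likely culprit; if `java_home` is `None`, the JAVA_HOME step is.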
I managed to get a workaround for the aliexpress.com e-shop. But you should be notified that you will need to buy a proxy to get through
Their shop is embedded in JavaScript, so you either use Selenium or a custom-written API
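For the proxy part, here is a minimal sketch of the mapping that the Requests library accepts through its `proxies=` argument. The host, port, and credentials are placeholders, and `build_proxies` is a hypothetical helper, not something from this script.

```python
from urllib.parse import quote


def build_proxies(host, port, user=None, password=None):
    """Build a scheme-to-proxy-URL mapping in the shape Requests accepts for `proxies=`."""
    auth = ""
    if user is not None and password is not None:
        # Percent-encode the credentials so characters like "@" do not break the URL.
        auth = f"{quote(user, safe='')}:{quote(password, safe='')}@"
    proxy_url = f"http://{auth}{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Placeholder values; a purchased proxy would supply the real ones.
proxies = build_proxies("proxy.example.com", 8080, "user", "p@ss")
```

With Requests this would be passed as `requests.get(url, proxies=proxies)`. A proxy alone does not run the page's JavaScript, which is why Selenium or a custom API is still needed for the product data.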
I will have to revamp my code to include support for major e-shops
The attached screenshots show the completed scraping progress.
I will update you when I am further along with the modifications.
If you have other ways to make the script better, why not fork it, add the necessary changes and push?
Let's keep our fingers crossed 🤞
I just woke up, but I'll go back to bed soon. Btw I'm using Linux (Arch), just so you know there are different audiences :))
I am a beginner programmer so I don't know if i'll be able to do much, but thanks :) I'll try my best
edit: also yes i typed the urls with https://
I will take good note of the various OS
I got you well covered.
We have a long day ahead of us
Good and wonderful morning to you dude
HUGE UPDATE !!!
I want to inform you that, behind the scenes, my team and I have been working on a new update. We have created an API for this scraper that allows you to call it and submit your link to be scraped.
While we work on that, I will be pushing an update to this script that makes your scraping process easier by eliminating the stress of always searching for and replacing classes.
This may not be the update you were expecting, but it's better than nothing
Cheers 🥂
I haven't forgotten about your Aliexpress request...
Also, to clarify: at base_url, should I include only the main URL, like
aliexpress.com
or something more specific, like
https://www.aliexpress.com/p/calp-plus/index.html?spm=a2g0o.best.testStatic.5.32422c25ygT70g&osf=category_navigate_newTab2&queryFrom=kingKong&categoryTab=us_beauty_%26_health
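Which of the two forms base_url expects is not answered at this point in the thread. As a sketch of how they relate, the standard library can split the longer URL into the site root and the category-specific parts; the variable names here are illustrative only.

```python
from urllib.parse import parse_qs, urlsplit

# The specific category URL from the question above.
specific_url = (
    "https://www.aliexpress.com/p/calp-plus/index.html"
    "?spm=a2g0o.best.testStatic.5.32422c25ygT70g"
    "&osf=category_navigate_newTab2"
    "&queryFrom=kingKong"
    "&categoryTab=us_beauty_%26_health"
)

parts = urlsplit(specific_url)

# Scheme plus host: the "main URL" form, with https:// included.
site_root = f"{parts.scheme}://{parts.netloc}"

# The category is carried in the query string; parse_qs decodes %26 to "&".
category_tab = parse_qs(parts.query)["categoryTab"][0]

print(site_root)
print(parts.path)
print(category_tab)
```

So the short form identifies only the site, while the long form also pins down one category page.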