I would like to be able to scrape all of the cards on a page for Id information, but I found that, as currently implemented, this scraper only grabs the top 2-6 via this line in OpenseaScraper.js: const cardsNodeList = document.querySelectorAll(".Asset--anchor"). This works great for finding the floor, but what if I wanted to grab all of the cards, would this be possible? I tried a couple of other options, like running querySelectorAll on '[role="gridcell"]', but ultimately found no improvement. Would you have any ideas on why it isn't able to grab the whole page's worth of cards? Thanks so much!
Hey, how many cards would you want to grab? Because the page dynamically loads the items and replaces the old ones. You can see that behavior if you check the DOM and scroll. Here's a screen recording of the DOM during scrolling:
(screen recording: the grid in the DOM while scrolling)
Hope you can see that the number of elements within the grid (all elements with role="gridcell") dynamically adapts. New elements get added and, more importantly, older elements get removed. So even if you scroll through the whole collection and run your query selector operation at the end, you would unfortunately only get the last elements.
The way I would implement that is to watch the grid container, for example document.querySelector(".AssetsSearchView--assets"), with a MutationObserver and collect each card as it gets added to the DOM. It's not the most difficult task, but also not that trivial. Hope that helps!
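For illustration, a minimal sketch of that approach, assuming the grid container still matches ".AssetsSearchView--assets" and each card still matches '[role="gridcell"]' (both selectors may have changed since), with extractItem() as a hypothetical helper you would still need to fill in:

// Watch the grid container and collect each card as it appears,
// de-duplicated in a Map so re-rendered cards are only counted once.
const seen = new Map();

// hypothetical helper: pull whatever you need out of a card node;
// assumes each card contains a link whose href identifies the asset
const extractItem = (card) => {
  const href = card.querySelector("a")?.getAttribute("href") || "";
  return { id: href, text: card.textContent };
};

const grid = document.querySelector(".AssetsSearchView--assets");
const observer = new MutationObserver(() => {
  // re-scan all currently mounted cards on every DOM change
  document.querySelectorAll('[role="gridcell"]').forEach((card) => {
    const item = extractItem(card);
    if (item.id && !seen.has(item.id)) seen.set(item.id, item);
  });
});
observer.observe(grid, { childList: true, subtree: true });
// then scroll through the collection; when done:
// observer.disconnect(); console.log([...seen.values()]);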
Wow, thank you for such a thorough response!! I'll definitely have to check this out, thank you so much!
@seanmc9 Have you implemented this functionality? Just curious as I'm considering taking a stab at it. Thanks!
Hey, sorry, no I haven't - would love to follow along with what you do though. Will you be implementing it on this repo?
@seanmc9 Not sure. I want to grab both the prices and token_id of n assets, so it would have additional functionality. Still looking at options.
@Jethro87 I like the idea of returning more information. If you want to implement it, feel free, I can also help. I have implemented a scraper for the rankings with scroll functionality (if you want to fetch more than the first 5 items) that works without a MutationObserver.
I would suggest having another function, for example OpenseaScraper.getOffers(slug, n), that returns an array of items, for example:
const items = await OpenseaScraper.getOffers("cool-cats-nft", 3);
// and items being an array with objects, see below 👇
[
  {
    tokenId: 5265,
    price: {
      currency: "ETH",
      amount: 7.82
    }
  },
  {
    tokenId: 2878,
    price: {
      currency: "ETH",
      amount: 7.9
    }
  },
  {
    tokenId: 6218,
    price: {
      currency: "ETH",
      amount: 7.9
    }
  }
]
@dcts Thanks, that's really helpful! That's precisely the functionality I'm looking for. Could you point me in the right direction on how to solve scrolling for n items? I investigated yesterday, but this is my first foray into scraping so it was very slow going. Appreciate it.
The mechanism I would recommend is the following:
1. Implement a function scrapeItems() that scrapes all items currently visible on the page. Each item contains a unique tokenId and a price.
2. Run scrapeItems() and store all results in a dictionary, using the unique tokenId as the key.
3. Scroll down a bit (window.scrollBy(0, 100)).
4. Run scrapeItems() again and update the results object (if a tokenId already exists, ignore the result).
5. Repeat steps 3 and 4 until enough items have been fetched or the end of the page is reached.
Here's an idea of how this could look:
const resultDict = {};
const n = 10; // target items to fetch
// @todo: implement scrapeItems() and save results to resultDict
var currentScrollTop = -1; // initialize scroll position, to track when end of page is reached
const interval = setInterval(() => {
  console.log("another scroll... dict.length = " + Object.keys(resultDict).length);
  window.scrollBy(0, 100);
  // @todo: run scrapeItems() again and update resultDict
  const endOfPageReached = document.documentElement.scrollTop === currentScrollTop;
  const enoughItemsFetched = Object.keys(resultDict).length > n;
  if (endOfPageReached || enoughItemsFetched) {
    clearInterval(interval);
    console.log("🥳 End Reached. ResultDict:");
    console.log(resultDict); // show scraped items
  } else {
    currentScrollTop = document.documentElement.scrollTop; // update current scroll position
  }
}, 200);
If you copy-paste the above code into the devtools console of any browser on any page, you should see the page automatically scrolling.
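In case it helps, a hypothetical scrapeItems() to plug into the two @todo spots could look roughly like this; the '[role="gridcell"]' selector and the way tokenId and price are read out of a card are assumptions and would need to be checked against the real markup:

// sketch only, not verified against the live page
function scrapeItems() {
  document.querySelectorAll('[role="gridcell"]').forEach((card) => {
    const link = card.querySelector("a");
    if (!link) return;
    // assumption: the asset link ends in .../<contractAddress>/<tokenId>
    const tokenId = link.href.split("/").pop();
    // assumption: the price appears as a plain number in the card's text
    const priceMatch = card.textContent.match(/\d+(\.\d+)?/);
    if (tokenId && !(tokenId in resultDict)) {
      resultDict[tokenId] = {
        tokenId: tokenId,
        price: priceMatch ? Number(priceMatch[0]) : null,
      };
    }
  });
}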
I'm working on an update, will deploy a minimal version by ~tomorrow.
Awesome, looking forward to it
I published an update with more functionality as discussed here. The new functions are documented in the main README file of this repository. Let me know if it works or if anything is unclear and I'm happy to help.
Closing this issue, please open a new one if you have any issues with the new functionality :)
@dcts This is great, thanks! There was a small bug that prevented me from running it, which I've fixed via PR. I've also added a new function to the PR here -> https://github.com/dcts/opensea-scraper/pull/7
@dcts has this functionality been removed? I find it very useful to be able to scroll down and get more results.
@alex-pcln how many items do you need from a given page? Because in the current version of the scraper you get the top 32 offers. Simply run:
let result = await OpenseaScraper.offers("cool-cats-nft");
If you need more than the top 32 offers you might want to downgrade to v2 or similar. If you want, I can look up the exact version if that's useful to you. But I am wondering in what use case you would need more than 32 offers? 🤔
Thanks @dcts, I downgraded to commit e01fea03e3 where it is still there.
Yes, I need more than 32; actually I need all assets of a collection that are currently on sale. Unfortunately, it seems that I cannot get that from their API, so I'm using your scraper. I just need to make some changes to detect the currency.
Well, if you need more than 32 items we should bring that functionality back, and maybe call the function offersByScrolling. If that's useful to you I don't see any downside. I can add it back in.
That would be great, thanks. It's not the most efficient way to get all assets for sale but I couldn't find any other way at the moment.
@alex-pcln which of the following methods do you need?
- offersByScrolling: you input the slug of the collection you want to fetch, for example cool-cats-nft.
- offersByScrollingByUrl: you input the url that points to a filtered collection, for example https://opensea.io/collection/boredapeyachtclub?search[sortAscending]=true&search[sortBy]=PRICE&search[stringTraits][0][name]=Background&search[stringTraits][0][values][0]=Purple&search[stringTraits][1][name]=Earring&search[stringTraits][1][values][0]=Silver%20Hoop&search[stringTraits][2][name]=Eyes&search[stringTraits][2][values][0]=Bloodshot&search[toggles][0]=BUY_NOW
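Calling them could then look something like this (a sketch only; the exact signatures, including whether they take a result-count argument like the getOffers(slug, n) proposal above, are assumptions):

// hypothetical usage of the two variants
const bySlug = await OpenseaScraper.offersByScrolling("cool-cats-nft", 100);
const filteredUrl = "https://opensea.io/collection/boredapeyachtclub?search[toggles][0]=BUY_NOW";
const byUrl = await OpenseaScraper.offersByScrollingByUrl(filteredUrl, 100);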
migrated to issue #29