Closed — michelcojocaru closed this issue 6 years ago
@michelcojocaru What kind of data are you interested in? Live ICOs, upcoming, finished? Info, time, progress? All of it? Should the data be a JSON object?
Live ICOs and upcoming ones are the most important. Here is an example of the info box you could crawl for each one (icowatchlist).
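For reference, here is a minimal sketch of what one crawled entry could look like as a JSON-friendly TypeScript type. The field names are assumptions based on a typical ICO info box, not the actual icowatchlist.com markup:

```ts
// Hypothetical shape of one crawled ICO entry; the field names are assumptions,
// not taken from icowatchlist.com itself.
interface IcoRecord {
  name: string;             // the ICO/project name shown in the info box
  status: "live" | "upcoming" | "finished";
  startDate?: string;       // ISO 8601, if listed
  endDate?: string;
  progressPercent?: number; // funding progress, if the site shows it
  website?: string;
  source: string;           // which listing site the record was crawled from
}
```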
We also need to think about a strategy for aggregating the data, because crawling different ICO websites will give us duplicates of the same ICO. This is an issue for the data model: the same ICO may appear on two websites and give you x properties, while another ICO appears on only one website and has x-i properties. So we need to compare the ICO websites and come up with a consistent data model.
Property comparison should be done at the back end. I have two suggestions (both are sketched below):

1. **Fixed model.** We agree beforehand on a fixed data structure and, from each website, keep only the attributes we need.
   - ++ Easy; makes life easier with LoopBack
   - -- Static; missing info
2. **Dynamic model.** We treat the data from the first website as the model and build upon it, e.g. if the 2nd website has more attributes than the model, we add them to the model.
   - ++ Rich; all info is aggregated
   - -- Dynamic models may be problematic with LoopBack
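To make the trade-off concrete, a rough sketch of both options in TypeScript, reusing the assumed IcoRecord field names from above. This is an illustration of the merge logic only, not a final LoopBack model:

```ts
// Option 1: fixed model - keep only the attributes we agreed on up front.
const FIXED_KEYS = ["name", "status", "startDate", "endDate", "progressPercent", "website"] as const;

function toFixedModel(raw: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of FIXED_KEYS) {
    // Attributes outside the agreed list are dropped on purpose.
    if (raw[key] !== undefined) out[key] = raw[key];
  }
  return out;
}

// Option 2: dynamic model - start from the first site's record and add any
// new attributes later sites expose, keyed by ICO name to avoid duplicates.
function mergeDynamic(
  aggregated: Map<string, Record<string, unknown>>,
  incoming: Record<string, unknown> & { name: string }
): void {
  const existing = aggregated.get(incoming.name) ?? {};
  // Spread order means later sources only fill in attributes that are still missing.
  aggregated.set(incoming.name, { ...incoming, ...existing });
}
```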
Done.
Crawl, parse, store & aggregate data from https://icowatchlist.com for front-end display
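As a starting point for the crawl step, a minimal sketch using axios and cheerio (the library choice and the CSS selectors are assumptions; the actual icowatchlist.com markup has to be inspected first), reusing the IcoRecord shape sketched earlier:

```ts
import axios from "axios";
import * as cheerio from "cheerio";

async function crawlIcoWatchlist(): Promise<IcoRecord[]> {
  // Fetch the listing page and load it into cheerio for parsing.
  const { data: html } = await axios.get("https://icowatchlist.com/");
  const $ = cheerio.load(html);
  const records: IcoRecord[] = [];

  // Placeholder selector: assumes one row per listed ICO.
  $(".ico-row").each((_, el) => {
    records.push({
      name: $(el).find(".ico-name").text().trim(),
      status: "live", // would be derived from which section of the site was crawled
      progressPercent:
        Number($(el).find(".ico-progress").text().replace("%", "")) || undefined,
      source: "icowatchlist.com",
    });
  });

  return records;
}
```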