msih-apify / Social-Media-and-Contact-Info-Extractor

Run this scraper for free: https://apify.com/vdrmota/contact-info-scraper
Apache License 2.0

Aggregate Data for Same Domain #2

Open MSIH opened 2 years ago

MSIH commented 2 years ago

The Actor creates a data file for each URL that contains data. If a website has several pages with data, a separate data file is created for each page.

Instead, create one data file per domain that contains the data from all pages within that domain.

MSIH commented 2 years ago

Loop through all the data records and, for each unique value of the key, create an array.

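let unique = [];
let key = "domain";
const { items } = await ResultsDataset.getData();
for (const record of items) {
    // Assumed completion (the snippet was cut off at "if("): keep the first record for each key value.
    if (!unique.some((seen) => seen[key] === record[key])) {
        unique.push(record);
    }
}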

const jsonDataStorage = await Apify.openKeyValueStore('jsonDataStorage');
// Save the unmodified records before de-duplicating.
await jsonDataStorage.setValue(datasetTitle + 'raw', items);

// Keep only the first record seen for each key value.
const lookup = new Map();
for (const record of items) {
    const key = record['placeid'];
    if (!lookup.has(key)) {
        lookup.set(key, record);
    }
}
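
The Map lookup above keeps only the first record per key, so data from later pages would be dropped. A rough sketch of merging instead, so that each domain ends up with one record holding the data from all of its pages, might look like the following. `aggregateByDomain` is a hypothetical helper, and the field names `url`, `emails` and `phones` are assumptions about the record shape rather than something confirmed in this thread; only the `domain` key comes from the snippet above.

```javascript
// Hypothetical helper: merge per-page records into one record per domain.
// Assumes each record carries its domain under `key` and that the fields in
// `listFields` are arrays (or missing). Adjust the names to the real output.
function aggregateByDomain(items, key = 'domain', listFields = ['emails', 'phones']) {
    const byDomain = new Map();
    for (const record of items) {
        const domain = record[key];
        if (domain === undefined) continue; // skip records without a domain
        if (!byDomain.has(domain)) {
            byDomain.set(domain, { [key]: domain, urls: [] });
        }
        const merged = byDomain.get(domain);
        if (record.url) merged.urls.push(record.url);
        for (const field of listFields) {
            // Union the values collected so far with this page's values.
            const seen = new Set(merged[field] || []);
            for (const value of record[field] || []) seen.add(value);
            merged[field] = [...seen];
        }
    }
    return [...byDomain.values()];
}

// Example with made-up page records.
const pages = [
    { domain: 'example.com', url: 'https://example.com/', emails: ['a@example.com'], phones: [] },
    { domain: 'example.com', url: 'https://example.com/contact', emails: ['a@example.com', 'b@example.com'], phones: ['555-0100'] },
    { domain: 'example.org', url: 'https://example.org/', emails: [], phones: [] },
];
const perDomain = aggregateByDomain(pages);
console.log(JSON.stringify(perDomain, null, 2));
```

The merged array could then be written with the same `jsonDataStorage.setValue(...)` call used for the raw items, under a different key name.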