ParkingReformNetwork / organization-dashboard

Supporting code for an InfluxDB dashboard showing how the Parking Reform Network is doing over time, such as # of social media followers and # donors.
MIT License

Add Instagram # of followers and # of publications #5

Open Eric-Arellano opened 1 year ago

Eric-Arellano commented 1 year ago

We're going to do this by pulling https://www.instagram.com/parkingreform/. That page can be accessed without an account or API.

Part 1: new instagram service

Update index.ts to have code like this:

https://github.com/ParkingReformNetwork/organization-dashboard/blob/a73a7c92eb896e62a7807097ab04efcb270b7982/src/index.ts#L31

https://github.com/ParkingReformNetwork/organization-dashboard/blob/a73a7c92eb896e62a7807097ab04efcb270b7982/src/index.ts#L73-L78

Create a file called src/instagram.ts. Have similar contents to this:

https://github.com/ParkingReformNetwork/organization-dashboard/blob/a73a7c92eb896e62a7807097ab04efcb270b7982/src/mapProjects.ts#L1-L40

For now, though, parsePoints should only return two hardcoded metrics: instagram-followers and instagram-publications. getCurrentPoints should only call parsePoints; it doesn't download anything yet. There's no need for getHistoricalPoints.
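
As a rough illustration of the Part 1 shape, here is a minimal sketch. The `Metric` interface and the placeholder values of 0 are assumptions made for this example; the real file should mirror the types and point-building code used in src/mapProjects.ts.

```typescript
// Hypothetical metric shape for illustration only; the real project
// presumably builds its points the same way src/mapProjects.ts does.
interface Metric {
  name: string;
  value: number;
}

// Part 1 only: return two hardcoded placeholder metrics, no parsing yet.
function parsePoints(): Metric[] {
  return [
    { name: "instagram-followers", value: 0 },
    { name: "instagram-publications", value: 0 },
  ];
}

// Nothing is downloaded yet, so this simply delegates to parsePoints.
// In the real module, this function would be exported for index.ts to use.
async function getCurrentPoints(): Promise<Metric[]> {
  return parsePoints();
}
```

Keeping getCurrentPoints async from the start means its signature won't need to change once Part 2 adds the HTTP request.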

Once you have this all wired up, make sure it works with npm start -- --services instagram. Then, get it working with npm run fmt and npm run lint. Once that's all good, git commit your changes.

Part 2: get the URL

Fetch PRN's Instagram page with Axios, a library for making HTTP requests in JavaScript. You should first spend some time learning what an HTTP request is, such as by reading https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview or asking ChatGPT.

Then, add code like this:

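    import axios from 'axios';
    const URL = 'https://www.instagram.com/parkingreform/';
    const response = await axios.get(URL, { responseType: 'text' });
    console.log(response.data); // the page's HTML as a string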

This should print all the HTML of the page. No need to commit this part.

Part 3: parse the points

Step 1: install JSDOM

JavaScript excels at understanding HTML pages. That's what it was originally created for!

When we run JavaScript with Node.js rather than in a browser, though, there is no built-in way to parse HTML. So, we need to install the tool JSDOM: https://github.com/jsdom/jsdom.

npm install --save jsdom

Commit this change.

Step 2: parse the HTML generally

We want to load the HTML with JSDOM, then select the HTML element we care about. If you haven't used HTML yet, please check out some online articles/guides about how HTML works.

Here's some sample code to select the footer that you could put in parsePoints:

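    import { JSDOM } from 'jsdom';

    // response.data is the HTML text returned by axios.get
    const dom = new JSDOM(response.data);
    const document = dom.window.document;
    const footer = document.querySelector('footer');
    // querySelector returns null when nothing matches
    console.log(footer?.outerHTML);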

We don't care about the footer, but this is useful to see how parsing works.

In getCurrentPoints, you'll also want to pass the response.data from axios.get to parsePoints. While you're iterating, you can set the type as any.
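
One way this wiring could look is sketched below. To keep the sketch self-contained, the HTTP call is passed in as a parameter; `HttpGetText`, `Metric`, `INSTAGRAM_URL`, and the placeholder body of parsePoints are all assumptions for illustration. In the real module, the getter would be a thin wrapper around axios.get(url, { responseType: 'text' }) that returns response.data.

```typescript
// Hypothetical metric shape; mirror whatever src/mapProjects.ts returns.
interface Metric {
  name: string;
  value: number;
}

// Hypothetical abstraction over the HTTP call: given a URL, resolve to the
// response body as text. With Axios this could be implemented as:
//   async (url) => (await axios.get(url, { responseType: "text" })).data
type HttpGetText = (url: string) => Promise<string>;

const INSTAGRAM_URL = "https://www.instagram.com/parkingreform/";

// While iterating, `html` could be typed as `any`; `string` matches what
// responseType "text" produces.
function parsePoints(html: string): Metric[] {
  if (html.length === 0) {
    throw new Error("Empty response body");
  }
  // Placeholder values: the real parsing of `html` comes in the next steps.
  return [
    { name: "instagram-followers", value: 0 },
    { name: "instagram-publications", value: 0 },
  ];
}

// Download the page, then hand the HTML text to parsePoints.
async function getCurrentPoints(getText: HttpGetText): Promise<Metric[]> {
  const html = await getText(INSTAGRAM_URL);
  return parsePoints(html);
}
```

Passing the getter in also makes it easy to call getCurrentPoints with canned HTML instead of a live request, though calling axios directly inside getCurrentPoints works just as well.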

Step 3: get the specific elements we care about

Now we need to figure out how to get access to the data we actually care about. This will require some investigation on your part, using your browser's developer tools. On the PRN page, right-click the part you care about and click "Inspect Element".

[Screenshot 2023-06-28 at 7:52:24 p.m.]

Look at the resulting HTML and determine how we can programmatically determine where the metrics we want are. For example, do they set a certain CSS class? Or an href value?

You'll then need to change the querySelector('footer') call so that it matches this element instead. This is done with CSS selectors. ChatGPT can be helpful with this. Start by figuring out how you would describe where the relevant metrics are in human terms, based purely on the HTML contents, and then we can figure out how to map this into code.
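
Once a selector returns the right element, its text still has to be turned into a number. The helper below is a sketch of that step only. The name `parseCount` and the input formats it accepts (such as "1,234" or "12.5K") are assumptions, so check them against the text the inspected element actually contains.

```typescript
// Hypothetical helper: convert display text such as "1,234" or "12.5K"
// into a number. The accepted formats are assumptions and have not been
// checked against the real page.
function parseCount(text: string): number {
  // A leading digit, then more digits/commas/dots, then an optional K or M suffix.
  const match = text.trim().match(/^(\d[\d.,]*)([KkMm])?/);
  if (!match) {
    throw new Error(`Could not parse a count from: ${text}`);
  }
  // Drop thousands separators before converting.
  const base = Number(match[1].replace(/,/g, ""));
  if (Number.isNaN(base)) {
    throw new Error(`Could not parse a count from: ${text}`);
  }
  const suffix = (match[2] ?? "").toLowerCase();
  const multiplier = suffix === "k" ? 1000 : suffix === "m" ? 1000000 : 1;
  return Math.round(base * multiplier);
}
```

With JSDOM, the input would be something like element.textContent for whichever element your selector finds. Throwing on unparseable text, rather than returning 0, helps surface a page layout change as an error instead of a silently wrong metric.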

Step 4: add a test

This will be similar to this test:

https://github.com/ParkingReformNetwork/organization-dashboard/blob/a73a7c92eb896e62a7807097ab04efcb270b7982/tests/mapProjects.test.ts#L1-L21

Create instagram.test.js. Also save the HTML from the Instagram page to a file in mocks/instagram.html. Then, your test should read in that HTML file and pass it to your parsePoints function.

Eric-Arellano commented 1 year ago

After looking into the above plan more, we realized that this code would technically violate Instagram's Terms of Service, which we don't want to do. So, we either need to go through their API or manually get the data using https://github.com/ParkingReformNetwork/organization-dashboard/issues/43