oorestisime / gatsby-source-instagram

Create nodes from instagram posts hashtags and profiles
https://gatsby-src-instagram.netlify.com/
MIT License

Public scraping methods fail due to login screen on Instagram on production builds #24

Closed wjx0820 closed 1 year ago

wjx0820 commented 5 years ago

I used this plugin in my demo and it did not work: it says it could not fetch Instagram posts and no Gatsby nodes were generated (I didn't use any token and just want public scraping for posts). So I cloned your repo, cd'd into /example, ran yarn install and then 'npm run develop', and the same problem seemed to happen. Am I missing something? Thanks!

oorestisime commented 4 years ago

Was it working with an access token before, or with public scraping? Several people reported that the way to get an access token might not be working, so I am trying to understand if something broke there :)

If you weren't using an access token on 0.7.0, can you try public scraping again on 0.8.0? Make sure to use the correct username id as explained at https://github.com/oorestisime/gatsby-source-instagram#public-scraping-for-posts

ngerbauld commented 4 years ago

yeii, it worked with public scraping! Thanksss @oorestisime !

oorestisime commented 4 years ago

I need to take some time to investigate the freaking mess of getting a valid access token. This is annoying. If anybody wants to research this, I am happy to not do it myself :D :D

KishokanthJeganathan commented 4 years ago

Hey @oorestisime

Thanks for all this work on the plugin :) I ran into the same issue, where I got this message: The gatsby-source-instagram plugin has generated no Gatsby nodes

I then added the user ID of the Insta page instead of the username and it worked. Issue is that it gave me 50 posts from Instagram and they are not in the order they were published in :/ do you have any fix for this?

oorestisime commented 4 years ago

Timestamp should be available for each post so you could sort them based on that.

andregmoeller commented 4 years ago

Issue is that it gave me 50 posts from Instagram and they are not in the order they were published in :/ do you have any fix for this?

Hey @KishokanthJeganathan, as already mentioned, you need to sort the returned list of elements. I expect that something like

allInstaNode(limit: 50, sort: { fields: timestamp, order: DESC }) {
…
}

should work.
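If sorting in the query is not convenient, the same ordering can be done in JavaScript once the nodes are in hand. This is only a sketch: it assumes each node's `timestamp` is a number (Unix seconds), and the sample data below is made up for illustration.

```javascript
// Sort post nodes newest-first by their `timestamp` field.
// Assumption: `timestamp` is numeric (Unix time in seconds).
function sortPostsNewestFirst(nodes) {
  // Copy first so the array returned by the query is not mutated.
  return [...nodes].sort((a, b) => b.timestamp - a.timestamp);
}

// Hypothetical sample data, for illustration only.
const sample = [
  { id: "a", timestamp: 1580000000 },
  { id: "c", timestamp: 1590000000 },
  { id: "b", timestamp: 1585000000 },
];

const sortedIds = sortPostsNewestFirst(sample).map((node) => node.id);
console.log(sortedIds); // → [ 'c', 'b', 'a' ]
```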

KishokanthJeganathan commented 4 years ago

@oorestisime @andregmoeller worked like a charm! Thanks heaps :)

Grsmto commented 4 years ago

Can anyone try getting a permanent token using this https://github.com/Bnjis/Facebook-permanent-token-generator ?

I tried it and I don't think it works (anymore?). Also, the latest Facebook API version 7.0 doesn't let you create a permanent token; it's only 60 days maximum afaik. If anyone has more info, please share!

PaulKleinschmidt commented 4 years ago

Hey @oorestisime, I may be missing something here

I'm trying to use the public scraping for a user's profile. After updating the gatsby-source-instagram to version 0.8.0, and updating my config to look like this (using instagram id instead of username):

{
   resolve: `gatsby-source-instagram`,
   options: {
     type: `user-profile`,
     username: `my-instagram-id`
   }
},

I'm getting the following error on my netlify build

4:49:05 PM: Could not fetch instagram user. Error status TypeError: Cannot destructure property `user` of 'undefined' or 'null'.
4:49:05 PM: error #11321 PLUGIN Cannot read property 'id' of null
4:49:05 PM: "gatsby-source-instagram" threw an error while running the sourceNodes lifecycle:
4:49:05 PM: Cannot read property 'id' of null
4:49:05 PM: See our docs page for more info on this error: https://gatsby.dev/issue-how-to
4:49:06 PM: 
4:49:06 PM: 
4:49:06 PM:   TypeError: Cannot read property 'id' of null
4:49:06 PM:   
4:49:06 PM:   - gatsby-node.js:76 createUserNode
4:49:06 PM:     [repo]/[gatsby-source-instagram]/gatsby-node.js:76:15

My query looks like this

const data = useStaticQuery(
    graphql`
      query {
        instaUserNode {
          edge_followed_by {
            count
          }
        }
      }
    `
  );

Thank you for the help!!!

oorestisime commented 4 years ago

Folks, please, this is getting to be a huge thread, and sometimes the answers are just in the Readme.

https://github.com/oorestisime/gatsby-source-instagram#public-scraping-for-a-users-profile no longer works, as I mentioned.

If you want to get the posts of a profile then use this https://github.com/oorestisime/gatsby-source-instagram#public-scraping-for-posts
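For reference, here is a minimal sketch of a gatsby-config.js entry for public scraping of posts. It follows what the thread describes (the `username` option is given the numeric Instagram ID, not the handle). The helper names and the placeholder ID are hypothetical, and the README remains the authority on the options.

```javascript
// Returns true when the value looks like a numeric Instagram user ID
// rather than a handle such as "my_account".
function looksLikeInstagramId(value) {
  return /^\d+$/.test(String(value));
}

// Build the plugin entry for gatsby-config.js (public scraping for posts).
// Hypothetical helper: the plugin itself only needs the returned object.
function instagramPostsPlugin(userId) {
  if (!looksLikeInstagramId(userId)) {
    throw new Error(`Expected a numeric Instagram ID, got "${userId}"`);
  }
  return {
    resolve: `gatsby-source-instagram`,
    // Despite its name, `username` carries the numeric ID here (per the thread).
    options: { username: String(userId) },
  };
}

// "1234567890" is a placeholder, not a real account ID.
const pluginEntry = instagramPostsPlugin("1234567890");
console.log(pluginEntry.options.username); // → 1234567890
```

The guard fails the build early with a readable message if a handle is passed by mistake, instead of the "no Gatsby nodes" error reported above.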

joshua-isaac commented 4 years ago

Seems like https://codeofaninja.com/tools/find-instagram-user-id went down? It gives an error when trying to spit out the id. Is it working for anyone else?

andregmoeller commented 4 years ago

Seems like https://codeofaninja.com/tools/find-instagram-user-id went down? It gives an error when trying to spit out the id. Is it working for anyone else?

Please try the following steps:

  1. Open a browser and enter the URL of the respective user's Instagram profile.
  2. Add ?__a=1 to the URL – it should look similar to https://www.instagram.com/username/?__a=1
  3. Take a look at the logging_page_id element. Its value should start with profilePage_. The number that follows should be the respective user's Instagram id.

andregmoeller commented 4 years ago

I agree with @joshua-isaac. https://codeofaninja.com/tools/find-instagram-user-id does not work any longer. Yesterday, I posted an alternative way to determine the Instagram id. Today, I noticed that you don't need to append ?__a=1 to the URL. You just have to open the respective profile page in a browser, then open the source code (Firefox: CTRL+U / CMD+U, Chrome: CTRL+U / OPTION+CMD+U, Safari: OPTION+CMD+U) and search for profilePage_. The number that follows should be the Instagram ID.

It would be great if someone could confirm if it works or not. Thank you!
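The manual search for profilePage_ can also be scripted. The sketch below assumes you already have the page source (or the logging_page_id value) as a string; the sample string is invented and only mimics the shape described above.

```javascript
// Pull the numeric ID that follows "profilePage_" out of a string.
// Works on the full page source or on the logging_page_id value alone.
function extractInstagramId(pageSource) {
  const match = /profilePage_(\d+)/.exec(pageSource);
  return match ? match[1] : null; // null when the marker is absent
}

// Invented sample, shaped like the value described in the thread.
const sampleSource = '{"logging_page_id":"profilePage_1234567890"}';
const foundId = extractInstagramId(sampleSource);
console.log(foundId); // → 1234567890
```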

oorestisime commented 4 years ago

That's great @andregmoeller . Would you be able to open a PR to modify readme with this information?

andregmoeller commented 4 years ago

That's great @andregmoeller . Would you be able to open a PR to modify readme with this information?

Sure, I am going to open a PR tomorrow.

kuoloneous commented 4 years ago

I agree with @joshua-isaac. https://codeofaninja.com/tools/find-instagram-user-id does not work any longer. Yesterday, I posted an alternative way to determine the Instagram id. Today, I noticed that you don't need to append ?__a=1 to the URL. You just have to open the respective profile page in a browser, then open the source code (Firefox: CTRL+U / CMD+U, Chrome: CTRL+U / OPTION+CMD+U, Safari: OPTION+CMD+U) and search for profilePage_. The number that follows should be the Instagram ID.

It would be great if someone could confirm if it works or not. Thank you!

Can confirm it works.

Can we assume this method will no longer work after June 29th?

From https://www.instagram.com/developer/ :

Q: Why should I migrate to the Instagram Graph API platform? A: In January 2018, we publicly announced our plans to shut down the Instagram Legacy API platform through a sequenced approach. We plan to disable the final permission remaining on the Legacy API ("Basic Permission") on June 29, 2020 and any existing apps using the Legacy API will no longer have access. We encourage you to apply for permissions to Instagram Basic Display API and migrate Legacy API calls before June 29 to avoid interruption of service to your app and business. Note that App Review submissions can take up to a week or longer to process. Refer to the developer documentation to learn more.

Q: Can I continue using the Instagram Legacy API platform after June 29? A: No, the Legacy API platform will no longer be available as of June 29. We encourage you to apply for permissions to the Instagram Basic Display API via App Review.

oorestisime commented 4 years ago

Well, I hope it will work through the summer, since I will be away a lot and won't be able to fix it. I thought it would work until the end of September. My plan is to eventually find a nice, easy way to generate Graph API tokens so that everybody can use this method until the dust settles with all their API changes, and then find a convenient way to do public scraping.

meeroslav commented 4 years ago

I agree with @joshua-isaac. https://codeofaninja.com/tools/find-instagram-user-id does not work any longer. Yesterday, I posted an alternative way to determine the Instagram id. Today, I noticed that you don't need to append ?__a=1 to the URL. You just have to open the respective profile page in a browser, then open the source code (Firefox: CTRL+U / CMD+U, Chrome: CTRL+U / OPTION+CMD+U, Safari: OPTION+CMD+U) and search for profilePage_. The number that follows should be the Instagram ID.

It would be great if someone could confirm if it works or not. Thank you!

Perhaps even simpler would be to check your cookie for Instagram page and look for key ds_user_id (should be the last item in the cookie).
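A small sketch of reading that key from a cookie string follows. The cookie names other than ds_user_id and all values are invented, and note the assumption that ds_user_id holds the ID of the account you are logged in as, so this would only help for your own profile.

```javascript
// Find one key in a "name=value; name2=value2" cookie string.
function getCookieValue(cookieString, key) {
  for (const part of cookieString.split(";")) {
    const [name, ...rest] = part.trim().split("=");
    // Re-join in case the value itself contains "=".
    if (name === key) return decodeURIComponent(rest.join("="));
  }
  return null; // key not present
}

// Invented cookie string for illustration.
const sampleCookie = "mid=abc; csrftoken=xyz; ds_user_id=1234567890";
const dsUserId = getCookieValue(sampleCookie, "ds_user_id");
console.log(dsUserId); // → 1234567890
```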

Aarekaz commented 4 years ago

[image]

I followed a lot of guides and even did this and used the end id, but I still get the same error. I just want to show my posts on my website, nothing more. I spent my whole day looking into this, but still can't figure it out. My version is 0.8.0.

ghost commented 4 years ago

@Aarekaz It works using the Graph API. I just created an IG/Facebook account a few days ago and, while it wasn't exactly smooth sailing, persistence paid off in the end. Make sure you follow the instructions carefully. Make sure you're using an Instagram Business account. Make sure you have a Facebook page linked to your Instagram Business account. And make sure you're not trying to use the temporary token (even though it might look the same at the beginning as the long-lease token).
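As a rough sketch of what the Graph API setup looks like in gatsby-config.js: the option names `access_token` and `instagram_id` are as I recall them from the plugin README, so please verify them there. The environment variable names and the helper function are hypothetical.

```javascript
// Build the plugin entry for the Graph API method from environment values.
// Env var names are placeholders; use whatever your host (e.g. Netlify) defines.
function instagramGraphPlugin(env) {
  const { INSTAGRAM_USERNAME, INSTAGRAM_TOKEN, INSTAGRAM_BUSINESS_ID } = env;
  if (!INSTAGRAM_TOKEN || !INSTAGRAM_BUSINESS_ID) {
    // Fail early instead of silently generating no nodes.
    throw new Error("Set INSTAGRAM_TOKEN and INSTAGRAM_BUSINESS_ID before building");
  }
  return {
    resolve: `gatsby-source-instagram`,
    options: {
      username: INSTAGRAM_USERNAME || "",
      access_token: INSTAGRAM_TOKEN, // the long-lived token, not the temporary one
      instagram_id: INSTAGRAM_BUSINESS_ID, // ID of the Instagram Business account
    },
  };
}

// Fake values for illustration; in gatsby-config.js you would pass process.env.
const graphEntry = instagramGraphPlugin({
  INSTAGRAM_USERNAME: "example",
  INSTAGRAM_TOKEN: "fake-token",
  INSTAGRAM_BUSINESS_ID: "1234567890",
});
```

Keeping the token in an environment variable also keeps it out of the repository.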

Aarekaz commented 4 years ago

@balibebas Looks like it's working. I just used the ID from inspect element, but I suspect that is only a temporary solution!

Hope a permanent solution comes along with all these API changes.

jedifunk commented 4 years ago

unable to get the Graph API working, allInstaNode doesn't even show up in graphiql ... any thoughts?

peiche commented 3 years ago

I can't seem to get carouselImages working. Does someone have an example of this functioning?

samason commented 3 years ago

This issue has resurfaced for me just recently; public scraping works locally but is failing with the same error as earlier last year on Netlify builds. Seems like there's been an update on the Instagram side?

listiani13 commented 3 years ago

@samason ran into the same issue yesterday, seems to be working if you use the Graph API

brianshimkus commented 3 years ago

This issue also just resurfaced for me. I tried gatsby clean, uninstalling the packages and reinstalling, different versions, etc. without being able to get a successful build with Netlify.

alexanderdejong commented 3 years ago

Yeah, it's happening to me too. I think it could be caused by having too many builds, so that Instagram shows the login wall. I will try again later and see if the problem persists.

brianshimkus commented 3 years ago

I tried another build today after doing nothing with the code since the comment above and now it builds just fine with Netlify. Weird.

samason commented 3 years ago

I ended up making an AJAX request to the same endpoint the plugin uses, right within my React component. Although you lose some of the benefits of this plugin, that approach seems to be more reliable and has the added benefit of not requiring a site build to get updated posts from Instagram (and although it sounds like the legit, non-public-scraping method works, as people have mentioned above it's pretty tedious to set up).

ajdorexyz commented 3 years ago

I ended up making an AJAX request to the same endpoint the plugin uses, right within my React component. Although you lose some of the benefits of this plugin, that approach seems to be more reliable and has the added benefit of not requiring a site build to get updated posts from Instagram (and although it sounds like the legit, non-public-scraping method works, as people have mentioned above it's pretty tedious to set up).

Up until recently, I was calling a Netlify lambda function and passing the required Instagram ID as a parameter to fetch data from the same endpoint this plugin uses (in a manner similar to what Wes Bos does in this video https://www.youtube.com/watch?v=9Ryc5MQTUYc).

Recently, in production, I started getting 502 errors and bad requests when running the functions on Netlify. Like @samason, I moved the logic from the serverless function directly into the React component, where I'm calling an async function inside the useEffect hook.

For the time being I'm not having any issues with this method, and for my particular use case the aforementioned benefit of not having to trigger a rebuild to collect new Instagram data, alongside live updates that you can cache within your component if necessary, is pretty good.
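One way to do the caching mentioned above is a small time-to-live cache, so repeated renders or visits within a window do not all hit the endpoint. This is a generic sketch, not something from the plugin; `createTtlCache` is a made-up helper, and the injected clock exists only to make the example deterministic.

```javascript
// Minimal single-entry cache whose value expires after `ttlMs` milliseconds.
// `now` is injectable so the behaviour can be checked without waiting.
function createTtlCache(ttlMs, now = Date.now) {
  let entry = null; // { value, expiresAt }
  return {
    get() {
      return entry && now() < entry.expiresAt ? entry.value : null;
    },
    set(value) {
      entry = { value, expiresAt: now() + ttlMs };
    },
  };
}

// Demonstration with a fake clock.
let fakeTime = 0;
const postsCache = createTtlCache(1000, () => fakeTime);
postsCache.set(["post-1"]);

fakeTime = 500;
const cachedHit = postsCache.get(); // still fresh

fakeTime = 1500;
const cachedMiss = postsCache.get(); // expired, so null
```

In a component you would check `get()` before calling fetch and call `set()` with the parsed posts afterwards.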

oorestisime commented 3 years ago

Sorry for the late reply. Seems like another login wall by Instagram. Unfortunately I don't have time right now to see what I could do to bypass it.

I think it's clear Instagram doesn't want us to do public scraping :) I suggest moving to the Graph API as soon as possible; there's not much I can do to save public scraping.

dbertella commented 3 years ago

@samason can you share a code example of what you've done? Can you fetch from Instagram as an unauthenticated user? OK, I see, is it this?

https://instagram.com/graphql/query/?query_id=17888483320059182&variables={"id":"${username}","first":100,"after":null}

I think I'll have a go. Today my builds started to fail again, even though I was using the id method.

dbertella commented 3 years ago

OK, just in case someone wants to follow this path, this is my attempt to fetch all the Instagram posts from the client:

// Same endpoint the plugin uses:
// https://github.com/oorestisime/gatsby-source-instagram/blob/master/src/instagram.js
import { useEffect, useState } from "react";

const igUrl = (userId) =>
  `https://instagram.com/graphql/query/?query_id=17888483320059182&variables={"id":"${userId}","first":12,"after":null}`;

...

  const [allInsta, setAllInsta] = useState([]);
  useEffect(() => {
    fetch(igUrl(IG_ID))
      .then((res) => res.json())
      .then(({ data }) => {
        // Keep only the fields needed for display; the response contains more.
        const photos = data.user.edge_owner_to_timeline_media.edges
          .filter((edge) => edge.node)
          .map(({ node }) => ({
            id: node.id,
            thumbnail: node.thumbnail_resources[2].src,
            // A post may have no caption, so guard against an empty edges array.
            caption: node.edge_media_to_caption.edges[0]?.node.text ?? "",
          }));
        setAllInsta(photos);
      })
      // Log instead of crashing when the request fails or the response
      // is not the expected JSON (for example, the login wall).
      .catch((err) => console.error("Could not fetch Instagram posts", err));
  }, []);

LarsBehrenberg commented 3 years ago

I encountered this issue as well last year. And I don't think it's this plugin's fault, but rather down to Instagram doing weird stuff. In the meantime, I also went with fetching the posts manually and wrote a blog post about how to do so. Here are some links to explanation in the blog post, code, and working demo.

owenhoskins commented 3 years ago

Here are some links to explanation in the blog post, code, and working demo.

Hey @LarsBehrenberg, thanks for sharing! The method you describe is using the same end-point as the current Public scraping for posts:

https://www.instagram.com/graphql/query?query_id=17888483320059182&variables={"id":"${INSTAGRAM_ID}","first":${PHOTO_COUNT},"after":null}
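The URL above embeds raw JSON in the query string. A slightly safer way to build it is to let URLSearchParams do the encoding. This is a sketch only: `buildPostsUrl` is a hypothetical helper, the query_id is the one quoted in this thread, and there is no guarantee the endpoint keeps accepting it.

```javascript
// query_id as quoted in this thread; it may stop working at any time.
const QUERY_ID = "17888483320059182";

// Build the public posts URL with properly encoded JSON variables.
function buildPostsUrl(instagramId, photoCount = 12) {
  const variables = JSON.stringify({
    id: String(instagramId),
    first: photoCount,
    after: null,
  });
  const params = new URLSearchParams({ query_id: QUERY_ID, variables });
  return `https://www.instagram.com/graphql/query/?${params.toString()}`;
}

// "1234567890" is a placeholder ID.
const postsUrl = buildPostsUrl("1234567890", 12);

// Round-trip check: the variables decode back to the original object.
const decodedVariables = JSON.parse(new URL(postsUrl).searchParams.get("variables"));
console.log(decodedVariables.first); // → 12
```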

I see that it is working in the demo. But I wonder if it would also suffer from the same "rate-limiting" login wall that we've been hitting lately.

In my use case, for an artist's representation agency, I am scraping the posts of 60+ Instagram accounts, and after a few builds within a short time frame we face the login wall. I wonder if the dynamic component would suffer a similar fate if the page had many visitors and thus many requests to the endpoint?

LarsBehrenberg commented 3 years ago

@owenhoskins Good question! That might well be the case. I haven't run into any issues using this method yet, and I have been using it for the last 8 months or so. So I guess you'll just have to try it out and see what happens... Sorry for not being more helpful :S

radscheit commented 3 years ago

@LarsBehrenberg Thanks for your effort and I like the approach. But currently, your demo isn't working any longer – at least for me:

Screenshot 2021-02-16 at 08 47 20

LarsBehrenberg commented 3 years ago

@LarsBehrenberg Thanks for your effort and I like the approach. But currently, your demo isn't working any longer – at least for me:

Screenshot 2021-02-16 at 08 47 20

Not sure, I just checked and it's working fine for me.

radscheit commented 3 years ago

@LarsBehrenberg Thanks for your effort and I like the approach. But currently, your demo isn't working any longer – at least for me: Screenshot 2021-02-16 at 08 47 20

Not sure, I just checked and it's working fine for me.

I can reproduce these errors when using the same IP address, but it works when I change my IP address via VPN. So does consumption of that API depend on a per-end-user rate limit?
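Whatever triggers the wall, a client-side fetch can at least degrade gracefully instead of throwing. The sketch below assumes that a blocked request comes back either as a non-JSON body (such as a login page) or as JSON without `data.user`, which matches the "Cannot destructure property `user`" error earlier in this thread; that assumption is not confirmed by Instagram documentation.

```javascript
// Turn a raw response body into a list of post edges, or [] when blocked.
// Assumption: a login wall yields non-JSON or JSON lacking data.user.
function extractEdges(bodyText) {
  let payload;
  try {
    payload = JSON.parse(bodyText);
  } catch (err) {
    return []; // e.g. an HTML login page
  }
  const edges = payload?.data?.user?.edge_owner_to_timeline_media?.edges;
  return Array.isArray(edges) ? edges : [];
}

// Invented bodies for illustration.
const blockedBody = "<html><body>Login</body></html>";
const okBody = JSON.stringify({
  data: { user: { edge_owner_to_timeline_media: { edges: [{ node: { id: "1" } }] } } },
});

console.log(extractEdges(blockedBody).length); // → 0
console.log(extractEdges(okBody).length); // → 1
```

With this in place the component can render an empty state or a link to the profile when the list is empty.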

VT-Web-Development commented 3 years ago

[image]

I followed a lot of guides and even did this and used the end id, but I still get the same error. I just want to show my posts on my website, nothing more. I spent my whole day looking into this, but still can't figure it out. My version is 0.8.0.

That's exactly what happened when I deployed to Netlify.

tijsluitse commented 3 years ago

Same here @VT-Web-Development! were you able to fix it? Having this issue on multiple websites now..

VT-Web-Development commented 3 years ago

Same here @VT-Web-Development! were you able to fix it? Having this issue on multiple websites now..

No - I am not using it for now. I will try other solutions.

joshua-isaac commented 3 years ago

Has anyone found a solution to this or a solution in general on how to get Instagram data in a react/gatsby app?

beamercola commented 3 years ago

@joshua-isaac The only thing that works for me is Zapier > Airtable

VT-Web-Development commented 3 years ago

@joshua-isaac The only thing that works for me is Zapier > Airtable

But you have to pay for it.

Aarekaz commented 3 years ago

Any updates? I have been having this issue and it's preventing me from deploying.

[image]

oorestisime commented 3 years ago

Hi, just to give a heads up.

There is nothing to do here for the plugin. It's on Instagram to stop their login walls; I can't do anything :( The reason I am keeping this ticket open is that it has a lot of information for anyone who wishes to read it.

SignetOHara commented 3 years ago

Hi everyone, thanks for this thread.

So just to clarify, if we use the Graph API method rather than public scraping, are there still issues? If not, does anyone know of any up-to-date tutorials/guides on how to set it up? Instagram/FB seem to want to make it as convoluted as possible...

I'm using public scraping during development and it usually works as long as the dev server isn't restarted numerous times (which fits with what everyone is saying). Haven't yet deployed to Netlify though.

mcljs commented 3 years ago

[image]

Hello everyone. This morning I installed the gatsby-source-instagram plugin and it worked for me, but now, at night, when I go to make other changes it no longer fetches anything. I get this and it does not manage to create my nodes. Could somebody help me?

tijsluitse commented 3 years ago

Hey there, I found out that my access token was no longer valid. After creating a new version via the steps explained here: https://www.gatsbyjs.com/plugins/gatsby-source-instagram/#instagram-graph-api-token, the public scraping worked again. You can check out your token here: https://developers.facebook.com/tools/debug/accesstoken/.
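Since tokens expire (60 days at most, according to an earlier comment), a build-time check can warn before the token goes stale. This sketch assumes you have the token's expiry as a Unix timestamp in seconds, for example copied from the token debugger; the function names and the 7-day threshold are made up.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days left until the given expiry (Unix seconds).
function daysUntilExpiry(expiresAtSeconds, nowMs = Date.now()) {
  return Math.floor((expiresAtSeconds * 1000 - nowMs) / MS_PER_DAY);
}

// True when the token should be regenerated soon.
function shouldRefreshToken(expiresAtSeconds, nowMs = Date.now(), thresholdDays = 7) {
  return daysUntilExpiry(expiresAtSeconds, nowMs) < thresholdDays;
}

// Illustration with "now" pinned to 0 so the numbers are easy to follow.
const sixtyDays = 60 * 24 * 60 * 60; // expiry 60 days after time 0, in seconds
console.log(daysUntilExpiry(sixtyDays, 0)); // → 60
console.log(shouldRefreshToken(sixtyDays, 0)); // → false
```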