InstaPy / instagram-profilecrawl

📝 quickly crawl the information (e.g. followers, tags etc...) of an instagram profile.
MIT License
1.16k stars, 245 forks

Use React State to get posts #153

Closed. 1um closed this issue 4 years ago

1um commented 5 years ago

Something like this can extract post info from the React state of the profile page. It was enough for me and significantly increased speed, since you don't need to open each post's page. Hope someone finds it helpful.

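JSGetPostsFromReact = """
    // The profile feed is rendered inside the first <article> element.
    var feed = document.getElementsByTagName('article')[0];
    if (!feed) { return null; }
    // React stores its internal instance under a key with a random suffix.
    var __reactInternalInstanceKey = Object.keys(feed).filter(k => k.startsWith('__reactInternalInstance'))[0];
    if (!__reactInternalInstanceKey) { return null; }
    // Walk up to the owning component and read the posts from its state.
    return feed[__reactInternalInstanceKey].return.stateNode.state.combinedPosts;
"""
# `browser` is a Selenium WebDriver with the profile page already loaded.
js_posts = browser.execute_script(JSGetPostsFromReact)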
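
Since this relies on React internals that can change without notice, a defensive wrapper may help. The following is only a sketch of how the snippet might be called: `get_posts_from_react` and `FakeBrowser` are hypothetical names, and the only thing assumed about `browser` is that it has Selenium's `execute_script` method.

```python
from typing import Any, Dict, List

# The JavaScript from above, with null checks so a missing element returns null.
JS_GET_POSTS_FROM_REACT = """
    var feed = document.getElementsByTagName('article')[0];
    if (!feed) { return null; }
    var key = Object.keys(feed).filter(k => k.startsWith('__reactInternalInstance'))[0];
    if (!key) { return null; }
    return feed[key].return.stateNode.state.combinedPosts;
"""


def get_posts_from_react(browser: Any) -> List[Dict[str, Any]]:
    """Return the posts found in the React state, or [] if nothing usable comes back.

    `browser` is assumed to be a Selenium WebDriver (anything with
    `execute_script`) that already has the profile page loaded.
    """
    try:
        posts = browser.execute_script(JS_GET_POSTS_FROM_REACT)
    except Exception:
        # For example, a JavaScript error because the page structure changed.
        return []
    if not isinstance(posts, list):
        return []
    # Selenium converts JS objects to dicts; drop anything else.
    return [p for p in posts if isinstance(p, dict)]


class FakeBrowser:
    """Hypothetical stand-in for a WebDriver, used only to try the wrapper."""

    def __init__(self, result: Any) -> None:
        self._result = result

    def execute_script(self, script: str) -> Any:
        return self._result
```

With a real driver you would call `get_posts_from_react(browser)` after the profile page has loaded; an empty list means the state could not be read and a slower per-post crawl may be needed.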

Example of the data for one post:

{
    "accessibilityCaption": "Image may contain: one or more people",
    "caption": "...",
    "code": "...",
    "commentsDisabled": false,
    "dimensions": {
        "height": 1317,
        "width": 1080
    },
    "gatingInfo": null,
    "hasRankedComments": false,
    "id": "...",
    "isSidecar": false,
    "isVideo": false,
    "location": {
        "hasPublicPage": true,
        "id": "...",
        "name": "San Francisco, California",
        "slug": "san-francisco-california"
    },
    "mediaPreview": "base64here",
    "numComments": 1,
    "numLikes": 43,
    "numPreviewLikes": 43,
    "overlayImageSrc": null,
    "owner": {
        "counts": {},
        "id": "...",
        "isNew": false,
        "username": "..."
    },
    "postedAt": 1551074843,
    "relatedMedia": [],
    "src": "...",
    "thumbnailResources": [
        {
            "configHeight": 150,
            "configWidth": 150,
            "src": "...."
        },
        {
            "configHeight": 240,
            "configWidth": 240,
            "src": "...."
        },
        {
            "configHeight": 320,
            "configWidth": 320,
            "src": "..."
        },
        {
            "configHeight": 480,
            "configWidth": 480,
            "src": "..."
        },
        {
            "configHeight": 640,
            "configWidth": 640,
            "src": "..."
        }
    ],
    "thumbnailSrc": "..."
}
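
To work with these dicts in Python, a small normalizing helper can be handy. This is a minimal sketch based only on the fields shown above. `summarize_post` is a hypothetical name, `postedAt` is assumed to be a Unix timestamp in seconds, and the `https://www.instagram.com/p/<code>/` link pattern is an assumption about how `code` maps to a post URL.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def summarize_post(post: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few commonly needed fields out of one post dict."""
    thumbs = post.get("thumbnailResources") or []
    # Choose the widest thumbnail, if any are present.
    largest = max(thumbs, key=lambda t: t.get("configWidth", 0), default=None)
    location = post.get("location") or {}  # location may be null
    posted_at = post.get("postedAt")
    code = post.get("code")
    return {
        # Assumed URL pattern for a post, built from its short code.
        "url": f"https://www.instagram.com/p/{code}/" if code else None,
        # Assumed to be Unix seconds; converted to an aware UTC datetime.
        "posted_at": (
            datetime.fromtimestamp(posted_at, tz=timezone.utc)
            if posted_at is not None
            else None
        ),
        "likes": post.get("numLikes", 0),
        "comments": post.get("numComments", 0),
        "is_video": bool(post.get("isVideo")),
        "location": location.get("name"),
        "largest_thumbnail": largest.get("src") if largest else None,
    }
```

For example, `[summarize_post(p) for p in js_posts]` would give one flat record per post, which is easier to dump to CSV or JSON than the raw state.
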
prafulfillment commented 4 years ago

👍 Added this to my PR; would love your review.