Closed by culda 2 months ago
I ran a crawl from the dashboard and it continued until my credits ran out. I want to query the content that was scraped, but I'm not sure how. Here is one of the records I get back:
{'id': 'dbae218e-753a-489a-8893-66eb65b85fa3', 'user_id': '5efa2ec1-4bc0-4047-bc4f-1901ef695ee6', 'url': '5efa2ec1-4bc0-4047-bc4f-1901ef695ee6/www.gov.uk/www_uk/10002090080147424289.md', 'domain': 'www.gov.uk', 'created_at': '2024-09-01T11:25:34.315714+00:00', 'updated_at': '2024-09-01T11:25:34.315714+00:00', 'pathname': '/world/travelling-to-the-democratic-republic-of-the-congo', 'fts': "'/world/travelling-to-the-democratic-republic-of-the-congo':1", 'scheme': 'https:', 'last_checked_at': '2024-09-01T11:25:34.213202+00:00', 'screenshot': False, 'status_code': 200}
I see a URL for the .md file, but how do I access it?
Am I supposed to set up webhooks to store the data in my own db as the crawl happens?
Thanks
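Side note on the record itself: the `url` field looks like a storage path for the .md file rather than a web link, while `scheme`, `domain`, and `pathname` appear to be enough to rebuild the address of the page that was crawled. A minimal sketch, assuming those field meanings are right (the helper name `source_url` is made up for illustration and is not part of any API):

```python
from urllib.parse import urlunsplit

# A trimmed copy of the record above, keeping only the fields used here.
record = {
    "url": "5efa2ec1-4bc0-4047-bc4f-1901ef695ee6/www.gov.uk/www_uk/10002090080147424289.md",
    "domain": "www.gov.uk",
    "pathname": "/world/travelling-to-the-democratic-republic-of-the-congo",
    "scheme": "https:",
    "status_code": 200,
}


def source_url(rec):
    """Rebuild the crawled page's address from scheme, domain and pathname.

    Hypothetical helper: assumes 'scheme' carries a trailing colon
    (as in 'https:') and 'pathname' starts with a slash.
    """
    scheme = rec["scheme"].rstrip(":")
    return urlunsplit((scheme, rec["domain"], rec["pathname"], "", ""))


print(source_url(record))
```

This only recovers where the page came from. It does not fetch the stored markdown, which is the part I still don't know how to do.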