Closed ekt1701 closed 8 years ago
I don't know Python very well, but from looking at the requests
documentation for a bit, I think something like this may work.
import requests
payload = {'url': 'http://www.radcircle.com', 'json_data': '{"title": "title"}'}
r = requests.post("http://www.jamapi.xyz", data=payload)
print(r.json())
From the looks of it you have to make an object for the form parameters. If this works for you feel free to close this issue and I'll also add this to the code samples in the readme.
Thank you so much, your code works for me.
BTW, using http://www.radcircle.com gives a database error: {u'title': u'Database Error'}
Yeah, I think there's an issue with their site, but you can sub http://www.radcircle.com
with any other URL and it'll still work (so long as the page has a title element on it).
Hello, using jamapi, how can I extract the following data in Python? I still don't understand how to code the json correctly. The actual page is http://earthquaketrack.com/us-ca-los-angeles/recent and I would like to get the first 5 events, not all 19 events. Thanks in advance.
<h4 class='title text-muted'>1.9 magnitude earthquake</h4>
<p>
<abbr class="timeago" title="2016-09-20T23:15:40Z">
2016-09-20 23:15:40 UTC
</abbr>
at 23:15 <br/>September 20, 2016 UTC
</p>
<p>
<strong>Location:</strong><br/>
Epicenter at 33.915, -118.304
<br/>
0.2 km from
<a href="/us-ca-gardena/recent">Gardena</a>
(0.2 miles)
</p>
It looks like the URL you provided has blocked jamapi.xyz from accessing it. You'll have to deploy your own version of jamapi to Heroku or something similar.
Here's the Python code I ran to determine that jamapi is actually forbidden: it returns an nginx error, and the title is 403 Forbidden.
import requests
json = '{"title": "title", "body": {"elem": "body", "html": "html"}}'
payload = {'url': 'http://earthquaketrack.com/us-ca-los-angeles/recent', 'json_data': json}
r = requests.post("http://www.jamapi.xyz", data=payload)
print(r.json())
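If it helps, a small check like the one below could flag this case automatically. It is only a sketch: `looks_forbidden` is a made-up helper name, and it assumes the query asked for a "title" key (as above) and that a blocked page reports itself through that title, which is what happened here but is not guaranteed for other sites.

```python
def looks_forbidden(result):
    """Guess whether the scraped page was an error page, based on its title.

    `result` is the dict returned by r.json(); the "title" key is only
    present because the query above asked for it.
    """
    title = str(result.get("title", ""))
    return "403" in title or "forbidden" in title.lower()


# Example with the kind of response described above (values are illustrative).
blocked = looks_forbidden({"title": "403 Forbidden"})
normal = looks_forbidden({"title": "Recent Earthquakes"})
```

With the request above, that would be `looks_forbidden(r.json())`.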
Thank you for looking into it. I'll have to find another way to get that data.
Sorry to bother you again, but what is the correct syntax in the json for:
<div class='post-body entry-content' itemprop='description articleBody'>
I have tried:
div[itemprop=description articleBody] gets: "error": "A provided CSS selector was not found on the provided "
div[itemprop='description articleBody'] gets: SyntaxError: invalid syntax
div[itemprop=\'description articleBody\'] gets: "error": "invalid JSON"
Maybe try w/ double quotes e.g. div[itemprop=\"description articleBody\"]
There should probably be a fix for this, but I'm not sure how long it'll take.
hmmm, when I tried that, I got the html for the home page: www.jamapi.xyz
EDIT, I'm getting that result, with code that worked before.
Try changing http://www.jamapi.xyz to https://www.jamapi.xyz
There's an issue with the update to SSL today.
Using https fixed the issue with the homepage.
However, div[itemprop=\"description articleBody\"] still gets "error": "invalid JSON"
Here is the entire json:
'json_data': '{"title": "title","paragraphs": [{ "elem": "div[itemprop=\"description articleBody\"] a:first-of-type", "text": "text"}]}'}
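One likely cause on the Python side: inside a normal (non-raw) Python string literal, \" is just an escaped quote, so the backslash is dropped and the JSON that gets sent has bare double quotes inside a quoted value. A way around hand-escaping, sketched below, is to build the selectors as a dict and let json.dumps do the quoting. The URL is a placeholder, and this only addresses the quoting; whether Jam API then accepts the selector itself is a separate question.

```python
import json

# Describe the selectors as a plain dict. Using single quotes for the Python
# string lets the CSS attribute selector keep its double quotes untouched.
selectors = {
    "title": "title",
    "paragraphs": [
        {
            "elem": 'div[itemprop="description articleBody"] a:first-of-type',
            "text": "text",
        }
    ],
}

# json.dumps escapes the inner double quotes, so the result is valid JSON.
json_data = json.dumps(selectors)

# Placeholder URL; substitute the page you actually want to scrape.
payload = {"url": "http://example.com/", "json_data": json_data}

# For comparison, a hand-written string in the style of the attempt above:
# Python turns \" into a bare ", which leaves the JSON malformed.
hand_written = '{"elem": "div[itemprop=\"description articleBody\"]"}'
```

The payload can then be posted the same way as before, e.g. requests.post("https://www.jamapi.xyz", data=payload).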
what's the url you're trying to get, maybe I can take a look at it?
http://doramaworld.blogspot.com/
I can get the title of each article with h3[itemprop=name], but not the body of the article.
I appreciate you taking a look.
EDIT: Is it possible to get the title and article in a single call?
Currently you can't get the body and title in one object, but what you can do is set your json_data to:
{
"post_titles": [{"elem": ".post .post-title a", "link": "href", "name": "text"}],
"post_bodies": [".post .post-body"]
}
This will return a post_titles array that has each post's title and permalink,
and a post_bodies array that has all the post content in it.
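In Python, the two-array query above could be sent and stitched back together roughly like this. It's a sketch under a few assumptions: `pair_posts` and `fetch_posts` are made-up helper names, the response is assumed to mirror the keys of the query, and the two arrays are assumed to come back in the same order with one body per title (if a post is missing a title or body, the pairing will be off).

```python
import json

# Same selectors as above, written as a dict so json.dumps handles the quoting.
selectors = {
    "post_titles": [{"elem": ".post .post-title a", "link": "href", "name": "text"}],
    "post_bodies": [".post .post-body"],
}

payload = {
    "url": "http://doramaworld.blogspot.com/",
    "json_data": json.dumps(selectors),
}


def pair_posts(result):
    """Combine titles and bodies by position (assumes matching order)."""
    titles = result.get("post_titles", [])
    bodies = result.get("post_bodies", [])
    return [
        {"name": t.get("name"), "link": t.get("link"), "body": b}
        for t, b in zip(titles, bodies)
    ]


def fetch_posts(api_url="https://www.jamapi.xyz"):
    """Post the query and return the paired posts (needs the requests package)."""
    import requests

    r = requests.post(api_url, data=payload)
    r.raise_for_status()
    return pair_posts(r.json())
```

Calling fetch_posts() would then give one dict per post with its name, link, and body.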
Hello, I cannot find the correct syntax, I have tried this:
import requests
r = requests.post('http://www.jamapi.xyz', 'url = "http://www.radcircle.com"', 'json_data = {"title": "title"}')
print r.text
and get the error message: { "statusCode": 400, "error": "Bad Request", "message": "Invalid request payload JSON format" }
If I change it to this:
r = requests.post('http://www.jamapi.xyz', 'url = "http://www.radcircle.com"', json_data = '{"title": "title"}')
I get this: TypeError: request() got an unexpected keyword argument 'json_data'