corincerami / mars-photo-api

A Rails API for photo data from NASA's Mars Rovers
https://api.nasa.gov/#MarsPhotos
GNU General Public License v3.0
358 stars 48 forks source link

Queries with pagination return not all photos. #65

Closed pavelrad closed 6 years ago

pavelrad commented 7 years ago

First I faced specific API behavior that it returns photos in different orders when they are queried with or without pagination: https://mars-photos.herokuapp.com/api/v1/rovers/curiosity/photos?sol=865 -> [60, 787, 4771, 9816] https://mars-photos.herokuapp.com/api/v1/rovers/curiosity/photos?sol=865&page=1 -> [4771, 60, 9816, 787] But then I verified the result with pagination on a day with the larger amount of photos, and it turned out that API returns some photos several times and some photos doesn't return at all. For sol 1000 for Curiosity with pagination, it returns 677 unique photos (instead of 856) and 179 photos are returned twice. If per_page > 202 it returns all photos :) Maybe this is due to the lack of sorting in output?

corincerami commented 6 years ago

Hey @pavelrad, sorry for the delayed response. I pushed up a change today that I believe fixed this issue. In order to verify the behavior is working as intended now, I used the following code:

(1..35).each do |page|
  url = "https://mars-photos.herokuapp.com/api/v1/rovers/curiosity/photos?sol=1000&page=#{page}"
  uri = URI(url)
  response = Net::HTTP.get(uri)
  photos = JSON.parse(response)["photos"]
  ids += photos.map { |p| p['id'] }
end
=> 1..35
ids.count
=> 856
ids.uniq.count
=> 856

I'm going to close this issue, but if you think the problem still exists (or I've introduced a new one with this change) feel free to let me know and I'll reopen it.

pavelrad commented 6 years ago

Hi @chrisccerami, using my example it looks like the outputs are the same now, but sorting is not as expected:

$ curl -sS 'https://mars-photos.herokuapp.com/api/v1/rovers/curiosity/photos?sol=865' | jq -c '[.photos | .[] | .id]'
[60,4771,787,9816]
$ curl -sS 'https://mars-photos.herokuapp.com/api/v1/rovers/curiosity/photos?sol=865&page=1' | jq -c '[.photos | .[] | .id]'
[60,4771,787,9816]

For me it's expected that ids will be returned sorted, like in the following example:

$ curl -sS 'https://mars-photos.herokuapp.com/api/v1/rovers/curiosity/photos?sol=865' | jq -c '[.photos | .[] | .id] | sort'
[60,787,4771,9816]
corincerami commented 6 years ago

The photos are only sorted by ID after first being sorted by camera. ID is not a particular meaningful field so I don't see a compelling reason why photos should be sorted by ID. That's why you're not seeing them strictly ordered by ID. They are however not sorted consistently, first by camera_id and then by ID.

pavelrad commented 6 years ago

I see. Thank you for the explanation.