shogg closed 5 years ago
@shogg thank you for the investigation! This actually makes sense. I updated the description: https://github.com/plutov/practice-go/blob/master/nasacollage/README.md
I scraped archivepix.html instead. It's still not fast, but it works well.
It took 22 min to gather all image URLs. With the API it took four hours to collect 2400 URLs because of the request limit. And the result was not exhaustive because requests sporadically returned server errors.
@shogg have you been able to build a collage after that? :)
I've been working on it again since Wednesday. I have a working algorithm that prints the best results found so far and keeps searching. But as I suspected, the problem is too complex to be searched exhaustively in a reasonable time.
My first idea was to limit the collage to a ground row of up to three images chosen from all images. But even that is too expensive: 8000!/(8000-3)! ~= 10^11.7 ground row variations. At the moment I'm testing with far fewer variations and some other restrictions.
PS: brute force would be to iterate variations of 11 images and check if they form a rectangle when stacked in that order: 8000!/(8000-11)! ~= 10^43 checks.
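To double-check the two estimates above, here is a small Go sketch that computes log10 of n!/(n-k)! by summing logarithms instead of building huge factorials. The function name `log10Variations` is made up for this example; the figure of 8000 images is the round number used above.

```go
package main

import (
	"fmt"
	"math"
)

// log10Variations returns log10 of n!/(n-k)!, i.e. the number of ordered
// selections of k distinct images out of n, on a log scale.
// (Hypothetical helper, not part of the exercise code.)
func log10Variations(n, k int) float64 {
	sum := 0.0
	for i := 0; i < k; i++ {
		sum += math.Log10(float64(n - i))
	}
	return sum
}

func main() {
	fmt.Printf("ground row of 3 images: 10^%.1f\n", log10Variations(8000, 3))
	fmt.Printf("stack of 11 images:     10^%.1f\n", log10Variations(8000, 11))
}
```

Summing logs keeps everything in float64 range, which is why no big-integer arithmetic is needed here.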
The API is very slow and generates HTTP 500 and other errors. You'll lose some pictures even when obeying the quota of 1000 requests/h. After a while it stopped working altogether for me. (While writing this I tested again: it works again, but the errors remain.)
The API is unreliable, but downloading the pictures linked in the API data works like a charm. I already used https://apod.nasa.gov/apod/archivepix.html to extract the dates needed to query the API. I think it could be easier to scrape the needed URLs out of the linked pages and not use the API at all. No quota involved. Just download 8500+ files in one batch.
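A minimal Go sketch of the scraping idea, for illustration only. It assumes the archive page links to each day's page with an `href` of the form `apYYMMDD.html` and that two-digit years from 95 up mean 19xx; both are assumptions about the page layout, and `parseArchive` and `entry` are names invented for this example. Fetching the page and pulling the image URL out of each linked page are left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// Base URL of the archive, taken from the archivepix.html link above.
const archiveBase = "https://apod.nasa.gov/apod/"

// Assumed link format on the archive page: href="apYYMMDD.html".
var pageLink = regexp.MustCompile(`href="(ap(\d{2})(\d{2})(\d{2})\.html)"`)

// entry is a hypothetical record: the date in YYYY-MM-DD form plus the
// absolute URL of that day's page.
type entry struct {
	Date    string
	PageURL string
}

// parseArchive extracts one entry per day-page link found in html.
func parseArchive(html string) []entry {
	var out []entry
	for _, m := range pageLink.FindAllStringSubmatch(html, -1) {
		century := "20"
		if m[2] >= "95" { // assumption: years 95..99 belong to the 1990s
			century = "19"
		}
		out = append(out, entry{
			Date:    century + m[2] + "-" + m[3] + "-" + m[4],
			PageURL: archiveBase + m[1],
		})
	}
	return out
}

func main() {
	// Made-up sample in the assumed format, not real page content.
	sample := `2019 January 02: <a href="ap190102.html">Sample B</a><br>
1995 June 20: <a href="ap950620.html">Sample A</a><br>`
	for _, e := range parseArchive(sample) {
		fmt.Println(e.Date, e.PageURL)
	}
}
```

The same list of dates could feed the API queries, or the page URLs could be fetched directly to avoid the quota.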
PS: I used this to slowly hammer the API:
cat dates04.txt | xargs -I{} curl "https://api.nasa.gov/planetary/apod?api_key=<mykey>&date={}"
I have 11 dates??.txt files with ~800 lines each. dates04.txt: https://drive.google.com/file/d/1UstQi3wGO5bzDvAd7AnrwmijIZw4zd6H/view?usp=sharing
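The same loop could be sketched in Go roughly like this. It only builds the request URLs from a block of dates and does not send anything; `apodURL` and `parseDates` are hypothetical helpers, `MY_KEY` is a placeholder, and the one-date-per-line YYYY-MM-DD layout of the dates files is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Endpoint as used in the curl command above.
const apodEndpoint = "https://api.nasa.gov/planetary/apod"

// apodURL builds the query URL for one date, escaping both parameters.
func apodURL(apiKey, date string) string {
	q := url.Values{}
	q.Set("api_key", apiKey)
	q.Set("date", date)
	return apodEndpoint + "?" + q.Encode()
}

// parseDates splits the content of a dates file into its
// whitespace-separated entries, skipping blank lines.
func parseDates(text string) []string {
	return strings.Fields(text)
}

func main() {
	// Made-up sample content standing in for a dates file.
	dates := parseDates("2019-01-01\n2019-01-02\n")
	for _, d := range dates {
		fmt.Println(apodURL("MY_KEY", d))
	}
}
```

Building the query with `url.Values` avoids the shell quoting issues around `?` and `&`.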