Closed · mschwamb closed this issue 7 years ago
@mschwamb It looks like the HiRISE site may have changed since the last time we ran this. Could you just confirm which image link we should be downloading from here: http://hirise.lpl.arizona.edu/ESP_040246_0935
At the moment the download script is getting the RGB colour, non-map-projected JPEG file (http://hirise-pds.lpl.arizona.edu/PDS/EXTRAS/RDR/ESP/ORB_040200_040299/ESP_040246_0935/ESP_040246_0935_RGB.NOMAP.browse.jpg), but it seems to be expecting to grab a j2k/jp2 file. Should it be downloading the last file in the "JP2 Extras" section (http://hirise-pds.lpl.arizona.edu/download/PDS/EXTRAS/RDR/ESP/ORB_040200_040299/ESP_040246_0935/ESP_040246_0935_RGB.NOMAP.JP2)?
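For what it's worth, the JP2 URL looks like it can be derived from the observation ID alone, since the archive bins observations into `ORB_xxxx00_xxxx99` directories by orbit number. Here's a minimal sketch assuming that directory layout holds for all observations (the `jp2_url` helper name is mine, not part of the existing scripts):

```python
def jp2_url(obs_id):
    """Build the RGB.NOMAP.JP2 download URL for a HiRISE observation ID.

    Assumes the PDS layout seen in this thread: observations binned into
    ORB_xxxx00_xxxx99 directories by the first four digits of the orbit.
    """
    phase, orbit, target = obs_id.split("_")  # e.g. "ESP", "040246", "0935"
    lo, hi = orbit[:4] + "00", orbit[:4] + "99"
    return ("http://hirise-pds.lpl.arizona.edu/download/PDS/EXTRAS/RDR/"
            f"{phase}/ORB_{lo}_{hi}/{obs_id}/{obs_id}_RGB.NOMAP.JP2")

print(jp2_url("ESP_040246_0935"))
```

That reproduces the "JP2 Extras" link above for ESP_040246_0935.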
@astopy My understanding is that the JP2 is the one processed because it is higher resolution than the browse.jpg. Yep, RGB.NOMAP.JP2 is what we want. Thanks for checking!
@mschwamb Thanks. I've run into another problem now -- the way the image metadata (i.e. acquisition date, coordinates, etc.) is marked up on the site has changed, so now it's all in one big table cell (instead of each item being in a separate cell). That's going to make it a lot harder to scrape the metadata. Before I start working on that, do we actually need all of that in Ouroboros? Or would it be good enough to just have the original name/download link to match them up later?
I actually have code to scrape metadata from PDS summary files, so if you give me the column names you need, I can provide that.
@michaelaye it might be better to provide @astopy with the code so we've got one pipeline, but that's up to @astopy. I think the P4 scripts are python if I recall correctly.
Unfortunately the scripts are actually Ruby, but @michaelaye if you could share your code that'd be great and I'll find a way to adapt it.
Sorry, I didn't see your reply. I still need to know which metadata columns are required, because some of them have several versions floating around and I need to make sure that I point you to the correct index file. Unfortunately it's not as simple as pointing to a metadata file for the RGB.NOMAP.JP2s, because those don't exist.
@michaelaye It should be the following for the non-map image:
"coords":[-85.759,106.051],"location":{"standard":"http://www.planetfour.org/subjects/standard/5501938669736d5fdd000000.jpg"},"metadata":{"acquisition_date":"2013-01-03T00:00:00Z","emission_angle":1.0,"lat_centered":-85.759,"lng_east":106.051,"local_mars_time":" 6:01 PM","name":"ESP_030184_0940","north_azimuth":130.0,"phase_angle":68.2,"sub_solar_azimuth": 37.5,"time":null}
@michaelaye There's also a full list here: https://github.com/zooniverse/planet-four/blob/master/data-import/fetch_source_file_and_metadata.rb#L33
I put a gist here: https://gist.github.com/702fff89930de42822d26e14fb182160
In my planetpy tools (`pip install planetpy`) there's a module called `pdstools`. In it I wrote highly generalized PDS index readers that wrap the Python Parameter Value Language module `pvl` (PVL is the syntax of Planetary Data System label files). The relevant functions are in https://github.com/michaelaye/planetpy/blob/master/planetpy/pdstools.py
But possibly it's just more efficient for you to quickly parse the specific data files yourself. Download the cumulative RDR index and its label file here:

http://hirise-pds.lpl.arizona.edu/PDS/INDEX/RDRCUMINDEX.LBL
http://hirise-pds.lpl.arizona.edu/PDS/INDEX/RDRCUMINDEX.TAB

The label file has the column names for the .TAB file, but in PVL format. The .TAB file is a fixed-format text file. You can read the column names and the column specification (i.e. the start and end byte of each column) out of the label file, in case you can feed them to a Ruby text parser.
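To illustrate the fixed-width idea, here's a minimal, stdlib-only sketch of slicing one .TAB record by byte positions. The column names and byte offsets below are purely illustrative — the real ones must be read from RDRCUMINDEX.LBL — and `parse_fixed_width` is a hypothetical helper, not part of planetpy:

```python
def parse_fixed_width(line, colspecs):
    """Slice one fixed-width .TAB record into named string fields.

    colspecs maps column name -> (start_byte, n_bytes), with a 1-based
    start byte as PDS labels specify. Values are stripped of padding
    and surrounding double quotes.
    """
    return {name: line[start - 1:start - 1 + nbytes].strip().strip('"')
            for name, (start, nbytes) in colspecs.items()}

# Hypothetical column specs and record -- real values come from the .LBL file.
specs = {"PRODUCT_ID": (1, 20), "CENTER_LATITUDE": (22, 8)}
record = '"ESP_030184_0940"    -85.759'
print(parse_fixed_width(record, specs))
```

The same start/length arithmetic translates directly to Ruby (`line[start - 1, nbytes]`).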
If this is all too much, we can also agree on a format and I can do the parsing for you and provide the results. Meg, I don't remember why we parse so many of those metadata items if we don't include them in the classification data file? Where are they being used?
@michaelaye That info is in the daily mongo database dumps we get sent. We don't normally get it in CSV, since it changes only a few times a year and adding the info to each classification would make the database bigger. The columns were chosen before I formally joined the science team. Given how close we are on the first paper, I think it would be better if you spend your time on that and @astopy tries to adapt your code. Do you agree?
Well, 95% of the work was already done last night; I just had to update my tools to meet standards, so it would have been stupid not to finish it, in case @astopy decides that the most efficient route is to just take my CSV output. I updated the gist and added a CSV file with the metadata for this data run; it should all be there.
Great, thanks @michaelaye. Unfortunately I'm pretty much out of time to get this imported into Ouroboros before our Christmas break here, as today's our last working day and I've got some data already being imported for another project that won't finish in time.
I'll be able to finish this up in the first week of January, so hopefully the current data will last until then.
@astopy - thanks. we're good on the live data front. Happy Holidays. Enjoy your much deserved holiday break!
@michaelaye Thanks again. I've integrated your script into the data import tools here: https://github.com/zooniverse/planet-four/blob/master/data-import/fetch_metadata.py
@mschwamb New data should be ready either today or Monday.
@mschwamb The data's in now!
We're preparing for the next set of images we'd like to put on the site. @astopy we'd want the following uploaded and cut into tiles:
ESP_040246_0935
ESP_039969_0935
ESP_039824_0935
ESP_039547_0935
ESP_039468_0935
ESP_038822_0935
ESP_038625_0930
ESP_038492_0935
ESP_038215_0935
ESP_038149_0935
ESP_038110_0930
ESP_037964_0935
ESP_040311_0940
ESP_040193_0940
ESP_037977_0940
ESP_037976_0940
@astopy when you have time could you run the processing and upload scripts?