NASA-PDS / peppi

Planetary Data Explorer: Python (PEPPi) library (pds.peppi) to access Planetary Data from the Planetary Data System (formerly known as updart)
https://nasa-pds.github.io/peppi
Apache License 2.0
As a user, I want to transform binary tables (.dat) in CSVs for all members of a collection #60

Open jordanpadams opened 1 week ago

jordanpadams commented 1 week ago

Checked for duplicates

No - I haven't checked

🧑‍🔬 User Persona(s)

Data User

💪 Motivation

...so that I can do my analysis

📖 Additional Details

No response

Acceptance Criteria

Given Collection urn:nasa:pds:mess-rs-raw:data-rsr
When I perform a loop through all members of the collection, and use pds4_tools to transform the data products
Then I expect a CSV to be generated for all products
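
A minimal sketch of the loop this criterion describes, assuming the caller supplies a per-product conversion function. `convert_all` and `convert_one` are hypothetical names and are not part of pds.peppi or pds4_tools. The sketch only shows one possible design: a failure on one product is recorded instead of stopping the batch, so "a CSV for all products" can be checked at the end.

```python
from typing import Callable, Iterable, List, Tuple


def convert_all(
    products: Iterable,
    convert_one: Callable[[object], str],
) -> Tuple[List[str], List[Tuple[object, Exception]]]:
    """Apply convert_one to every product and return (csv_paths, failures).

    convert_one is a hypothetical callable that converts a single product
    and returns the path of the CSV it wrote.
    """
    written: List[str] = []
    failures: List[Tuple[object, Exception]] = []
    for product in products:
        try:
            written.append(convert_one(product))
        except Exception as exc:
            # Record the failure and keep going, so one unreadable
            # product does not abort the whole collection.
            failures.append((product, exc))
    return written, failures
```

In a notebook, `products` would be the members of the collection and `convert_one` would wrap the download and conversion steps. An empty `failures` list would then correspond to the criterion being met.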

⚙️ Engineering Details

From a user:

I want these two data sets as csv files. My computer has some problems with .dat and .xml files. It would be a great help if you help me to download them as .csv files.

First: https://pds-ppi.igpp.ucla.edu/collection/urn:nasa:pds:mess-rs-raw:data-rsr

Second: https://pds-ppi.igpp.ucla.edu/collection/urn:nasa:pds:mess-rs-raw:data-odf

🎉 I&T

No response

jordanpadams commented 2 days ago

here is the code snippet. we just need to add this to a notebook.

import pds.peppi as peppi
import pdr
import os

from urllib.request import urlretrieve

client = peppi.PDSRegistryClient()

products = peppi.Products(client).of_collection("urn:nasa:pds:mess-rs-raw:data-rsr::1.0")
for p in products:
    print(p.properties['lidvid'], p.properties['ops:Label_File_Info.ops:file_ref'], p.properties['ops:Data_File_Info.ops:file_ref'])

    # download label
    local_label_path = p.properties['ops:Label_File_Info.ops:file_ref'][0].split('/')[-1]
    urlretrieve(p.properties['ops:Label_File_Info.ops:file_ref'][0], local_label_path)

    # download product
    local_data_path = p.properties['ops:Data_File_Info.ops:file_ref'][0].split('/')[-1]
    urlretrieve(p.properties['ops:Data_File_Info.ops:file_ref'][0], local_data_path)

    # use Planetary Data Reader (pdr) to read the product; tables are loaded as pandas DataFrames
    data = pdr.read(os.path.abspath(local_label_path))
    print(data.keys())

    # select the table by name (one of the keys printed above)
    table = data['MESSENGER_Radio_Science_Receiver_(RSR)_raw_data']

    # convert to csv
    print(f'outputting CSV {local_data_path}.csv')
    table.to_csv(f'{local_data_path}.csv')

    # the break below stops after the first product; remove it to convert every member
    break
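
The snippet above hard-codes one table name. A possible generalization is sketched below, assuming the object returned by `pdr.read` supports `keys()` and item access as used above, and that table objects expose `to_csv` the way pandas DataFrames do. `export_tables` is a hypothetical helper and is not part of pdr.

```python
import re


def export_tables(data, stem):
    """Write every table-like object in `data` to '<stem>.<key>.csv'.

    `data` is assumed to offer keys() and data[key]; returns the paths written.
    """
    paths = []
    for key in data.keys():
        obj = data[key]
        # Only objects exposing to_csv (e.g. pandas DataFrames) are exported;
        # anything else (label text, headers, arrays) is skipped.
        if not hasattr(obj, "to_csv"):
            continue
        # Replace characters that are awkward in file names, such as
        # spaces and parentheses, with underscores.
        safe_key = re.sub(r"[^A-Za-z0-9_.-]+", "_", str(key))
        path = f"{stem}.{safe_key}.csv"
        obj.to_csv(path)
        paths.append(path)
    return paths
```

Inside the loop, `export_tables(data, local_data_path)` could take the place of the hard-coded lookup, so that products with differently named tables are still converted.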

collinss-jpl commented 2 days ago

Thanks @jordanpadams, I was just looking over this code in the email thread and it occurred to me that this code should definitely land in a notebook in the peppi repo, but as a means of addressing this issue: https://github.com/NASA-PDS/peppi/issues/49

@tloubrieu-jpl and I were discussing that issue at the breakout yesterday. My main blocker was not understanding how to "drill down" through the PDS info model (via peppi) to get to concrete data, and this snippet seems like a perfect example. Thomas, do you think creating a notebook out of the code sample above and providing it to Onur would be sufficient to address https://github.com/NASA-PDS/peppi/issues/49 as well?

tloubrieu-jpl commented 2 days ago

Yes, the right place for the notebook is the repository 'search-api-notebook'.

tloubrieu-jpl commented 2 days ago

And right, @collinss-jpl, the notebook implementing Jordan's code would be enough to provide to Onur.