Closed ak983819 closed 4 months ago
@ak983819 Could I ask why you are not using the Python client?
I am using Jupyter Notebook. Let me know if that was not your question.
Have you tried using the python client instead of making direct HTTP requests? https://docs.materialsproject.org/downloading-data/using-the-api/querying-data
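For reference, a minimal sketch of the same band gap query through the Python client. It assumes the `mp-api` package is installed, that the summary endpoint is reachable as `mpr.materials.summary.search`, and that an open upper bound can be written as `None`; check these against the linked docs for your client version. The function name and the `rester_cls` parameter are hypothetical, added only so the function can be tried with a stand-in client.

```python
def fetch_wide_gap_formulas(api_key, min_gap=1.5, rester_cls=None):
    """Return (material_id, formula, band_gap) tuples for band gaps >= min_gap.

    rester_cls is a hypothetical hook for substituting a stand-in client;
    by default the real MPRester from mp-api is used.
    """
    if rester_cls is None:
        # Imported lazily so the module loads even without mp-api installed.
        from mp_api.client import MPRester
        rester_cls = MPRester

    with rester_cls(api_key) as mpr:
        docs = mpr.materials.summary.search(
            band_gap=(min_gap, None),  # assumption: None leaves the upper bound open
            fields=["material_id", "formula_pretty", "band_gap"],
        )

    return [(str(d.material_id), d.formula_pretty, d.band_gap) for d in docs]
```

The result can be passed straight to `pd.DataFrame(rows, columns=["material_id", "Formula", "Band Gap"])`.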
Got it! Yes, I have tried that one as well.
I don't understand why the formula column is empty in my output. Or is there a way to get the formula from the task ID?
If you just send requests to /materials/electronic_structure/ you should be able to add formula_pretty to _fields while still querying the band gap.
Yes, it's working, but the formula is still missing. This is the code I'm using right now:

    import requests
    import pandas as pd

    base_url = "https://api.materialsproject.org"
    endpoint = "/materials/electronic_structure/"
    api_key = "****"

    params = {
        "band_gap_min": 1.5,  # Minimum value for the band gap
        "_fields": "pretty_formula,band_gap,material_id",  # Fields to retrieve
        "_limit": 100,  # Adjust based on how many results you want per request
        "_skip": 0,  # For pagination
    }

    def fetch_data(params):
        headers = {"X-API-KEY": api_key}
        all_data = []
        batch_number = 0  # Keep track of how many batches have been fetched
        while True:
            response = requests.get(base_url + endpoint, params=params, headers=headers)
            if response.status_code == 200:
                data = response.json()["data"]
                if not data:
                    break  # Exit loop if no more data is returned
                all_data.extend(data)
                params["_skip"] += params["_limit"]  # Prepare for the next batch of data
                batch_number += 1
                print(f"Batch {batch_number} fetched. Total materials fetched: {len(all_data)}")
            else:
                print("Error fetching data:", response.status_code)
                break
        return all_data

    data = fetch_data(params)
    df = pd.DataFrame(data, columns=["pretty_formula", "material_id", "band_gap"])
    df.columns = ["Formula", "material_id", "Band Gap"]
    print(df)
This is a snapshot of my output file.
Or could you just tell me how to get the chemical formula from the material ID? That would also work for me.
And thank you so much for the quick response.
I really appreciate it!
Try 'formula_pretty', not 'pretty_formula'. Otherwise, query '/materials/core/' using 'material_ids'.
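Both suggestions can be sketched as follows. This is an illustration under assumptions, not a tested recipe against the live API: it assumes `/materials/core/` accepts `material_ids` as a comma-separated string and pages with `_limit`/`_skip` like the endpoint used above. The helper names (`fetch_all`, `wide_gap_materials`, `formulas_for_ids`) and the injectable `get` argument are hypothetical.

```python
BASE_URL = "https://api.materialsproject.org"


def fetch_all(endpoint, params, api_key, get=None, page_size=100):
    """Page through an endpoint with _limit/_skip until an empty batch is returned."""
    if get is None:
        import requests  # imported lazily so a stub `get` can be used instead
        get = requests.get

    query = dict(params, _limit=page_size, _skip=0)  # copy; caller's dict is untouched
    headers = {"X-API-KEY": api_key}
    rows = []
    while True:
        response = get(BASE_URL + endpoint, params=query, headers=headers)
        response.raise_for_status()
        batch = response.json()["data"]
        if not batch:
            return rows
        rows.extend(batch)
        query["_skip"] += page_size


def wide_gap_materials(api_key, min_gap=1.5, get=None):
    """Option 1: request formula_pretty (not pretty_formula) with the band gap."""
    params = {
        "band_gap_min": min_gap,
        "_fields": "formula_pretty,band_gap,material_id",
    }
    return fetch_all("/materials/electronic_structure/", params, api_key, get=get)


def formulas_for_ids(material_ids, api_key, get=None):
    """Option 2: look up formulas by ID via /materials/core/.

    Assumption: material_ids is passed as one comma-separated string.
    """
    params = {
        "material_ids": ",".join(material_ids),
        "_fields": "material_id,formula_pretty",
    }
    rows = fetch_all("/materials/core/", params, api_key, get=get)
    return {row["material_id"]: row["formula_pretty"] for row in rows}
```

With option 1, the DataFrame columns need to be read as `["formula_pretty", "material_id", "band_gap"]` before renaming, otherwise the formula column stays empty.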
Thank you so much! It's working now
I want to extract all the compositions available on the Materials Project with a band gap greater than 1.5 eV. I got output, but the formula column in the output file is empty. I got the task ID as well, but I have no idea how to get the formula from the task ID. I have been trying to fix this, but the output always comes back without the formula. Could you please help me extract all the compositions with a band gap of more than 1.5 eV?
    import requests
    import pandas as pd

    # Set the base URL for the Materials Project API
    base_url = "https://api.materialsproject.org"

    # Define the endpoint for retrieving band structure data
    endpoint = "/materials/electronic_structure/bandstructure/"

    # Define your API key
    api_key = ""  # Replace with your actual API key

    # Define the parameters for the band gap query
    params = {
        "band_gap_min": 1.5,  # Minimum value for the band gap
        "_fields": "pretty_formula,band_gap,material_id",  # Fields to retrieve
        "_limit": 100,  # Adjust based on how many results you want per request
        "_skip": 0,  # For pagination
    }

    # Function to fetch data with parameters
    def fetch_data(params):
        headers = {"X-API-KEY": api_key}
        all_data = []
        batch_number = 0  # Keep track of how many batches have been fetched
        while True:
            response = requests.get(base_url + endpoint, params=params, headers=headers)
            if response.status_code == 200:
                data = response.json()["data"]
                if not data:
                    break  # Exit loop if no more data is returned
                all_data.extend(data)
                params["_skip"] += params["_limit"]  # Prepare for the next batch of data
                batch_number += 1
                print(f"Batch {batch_number} fetched. Total materials fetched: {len(all_data)}")
            else:
                print("Error fetching data:", response.status_code)
                break
        return all_data

    # Fetch all compositions with a band gap above 1.5 eV
    data = fetch_data(params)

    # Convert the list of data to a pandas DataFrame
    df = pd.DataFrame(data, columns=["pretty_formula", "task_id", "band_gap"])

    # Rename columns for clarity
    df.columns = ["Formula", "Task ID", "Band Gap"]
    print(df)
Please help me.