Open keeganq opened 1 year ago
@keeganq, thanks for reporting this issue. This is happening as the API client by default tries to deserialize data into appropriate pymatgen objects. Since you have pymatgen installed but do not have the POTCAR configuration fully functional, it is giving you problems. This is something we are aware of on our end, and are planning a couple of different changes to fix it. For now, the easiest thing to do would be to pass `monty_decode=False` to `MPRester` alongside your API key. This should disable all deserialization by the client.
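For anyone landing here later, the workaround might look roughly like the sketch below. The helper name `fetch_chgcar_and_task` and its arguments are made up for illustration; only `MPRester`, `monty_decode`, `get_charge_density_from_material_id`, and `inc_task_doc` come from this thread, and the behavior may differ between mp-api versions.

```python
def fetch_chgcar_and_task(material_id, api_key):
    """Fetch charge density plus its task document with client-side decoding off.

    Hypothetical helper: the name and signature are not part of mp-api.
    """
    # Imported lazily so this sketch can be loaded without mp-api installed.
    from mp_api.client import MPRester

    # monty_decode=False asks the client to skip deserialization into
    # pymatgen objects, which is the step that needed a POTCAR setup.
    with MPRester(api_key, monty_decode=False) as mpr:
        return mpr.get_charge_density_from_material_id(
            material_id, inc_task_doc=True
        )


# Usage (placeholders, needs network access and a valid key):
# chgcar, task_doc = fetch_chgcar_and_task("<mpid>", "<your API key>")
```

With decoding disabled, expect plain dictionaries or document models rather than pymatgen objects in the returned data.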
Additionally, I have just realized that the latest changes to the `TaskDoc` model in `emmet-core` have broken pulling task data through the API. I have just pinned `emmet-core<=0.50.0` and released a patch as `mp-api==0.30.11`. Before pulling data, I would update your installation of both packages.
Thanks @munrojm! This is looking much better now. I am able to retrieve a `TaskDoc` with `get_charge_density_from_material_id(<mpid>, inc_task_doc=True)`. Would it be safe to assume that this `TaskDoc` is the one that is associated with the calculations used for the volumetric charge density data?
Yup! That is correct. The `CHGCAR` is taken from that specific calculation.
An update on this: I was able to configure pymatgen with a local set of POTCAR files, and was previously able to retrieve TaskDocs with `monty_decode=True` in `MPRester`, as you suggested. These TaskDocs would have decoded objects, specifically `TaskDoc.orig_inputs.potcar` would be a list of `pymatgen.io.vasp.inputs.PotcarSingle` objects.
Unfortunately, this isn't working after some recent changes to the API. The potcar is instead returned as an emmet Potcar object, i.e. it was not decoded. I think I've identified the problem, and it looks very intentional:
Assuming that this behavior was intended, is there a new recommended way to decode objects in the TaskDoc?
Thanks as always for your help!
I've actually disabled Monty decoding by default for the task endpoint while we work out a better solution for this. You can instead pass the data to the `process_decoded` method of `MontyDecoder` to decode it manually. Instantiating the `TaskDoc` with the data as input arguments should also decode any data that isn't nested using monty.
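A minimal sketch of the manual decoding route, assuming the task data is available as a plain dictionary (for example via `.dict()` on the returned document, which is an assumption and depends on the installed emmet/pydantic versions). The helper name `decode_task_data` is hypothetical; `MontyDecoder.process_decoded` is the method that appears in the stack trace in this issue.

```python
def decode_task_data(raw):
    """Run monty's decoder over an already-parsed dict or list.

    Hypothetical helper: entries carrying "@module"/"@class" markers are
    turned back into objects; everything else is returned unchanged.
    """
    # Imported lazily so this sketch can be loaded without monty installed.
    from monty.json import MontyDecoder

    return MontyDecoder().process_decoded(raw)


# Usage (placeholder names, assuming `task_doc` came back undecoded):
# decoded = decode_task_data(task_doc.dict())
```

Decoding POTCAR entries this way still requires a working pymatgen POTCAR configuration, as discussed earlier in this thread.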
I'm trying to retrieve charge density data, and the corresponding task information for the calculations that produced that data.
I'd like to be able to download the VASP input and output files associated with the volumetric charge density data for some materials.
Version Info
Reproduction
I'm trying to retrieve charge density for materials with `inc_task_doc=True`
Produces output:
Full Stack Trace
```python-traceback
Retrieving MaterialsDoc documents: 100%|██████████| 1/1 [00:00<00:00, 27413.75it/s]
Retrieving ChgcarDataDoc documents: 100%|██████████| 2/2 [00:00<00:00, 60787.01it/s]
Retrieving ChgcarDataDoc documents: 100%|██████████| 1/1 [00:00<00:00, 25575.02it/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 2
      1 with MPRester("") as mpr:
----> 2     chgcar = mpr.get_charge_density_from_material_id(mpid, inc_task_doc=True) # task=True ?? Look at github
      3     # print(chgcar)

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/mprester.py:1101, in MPRester.get_charge_density_from_material_id(self, material_id, inc_task_doc)
   1098     raise MPRestError(f"No charge density fetched for {material_id}.")
   1100 if inc_task_doc:
-> 1101     task_doc = self.tasks.get_data_by_id(latest_doc.task_id)
   1102     return chgcar, task_doc
   1104 return chgcar

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:839, in BaseRester.get_data_by_id(self, document_id, fields)
    836 results = [] # type: List
    838 try:
--> 839     results = self._query_resource_data(criteria=criteria, fields=fields, suburl=document_id) # type: ignore
    840 except MPRestError:
    842     if self.primary_key == "material_id":
    843         # see if the material_id has changed, perhaps a task_id was supplied
    844         # this should likely be re-thought

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:797, in BaseRester._query_resource_data(self, criteria, fields, suburl, use_document_model, timeout)
    774 def _query_resource_data(
    775     self,
    776     criteria: Optional[Dict] = None,
   (...)
    780     timeout: Optional[int] = None,
    781 ) -> Union[List[T], List[Dict]]:
    782     """
    783     Query the endpoint for a list of documents without associated meta information. Only
    784     returns a single page of results.
   (...)
    794         A list of documents
    795     """
--> 797     return self._query_resource( # type: ignore
    798         criteria=criteria,
    799         fields=fields,
    800         suburl=suburl,
    801         use_document_model=use_document_model,
    802         chunk_size=1000,
    803         num_chunks=1,
    804     ).get("data")

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:295, in BaseRester._query_resource(self, criteria, fields, suburl, use_document_model, parallel_param, num_chunks, chunk_size, timeout)
    292 if not url.endswith("/"):
    293     url += "/"
--> 295 data = self._submit_requests(
    296     url=url,
    297     criteria=criteria,
    298     use_document_model=use_document_model,
    299     parallel_param=parallel_param,
    300     num_chunks=num_chunks,
    301     chunk_size=chunk_size,
    302     timeout=timeout,
    303 )
    305 return data
    307 except RequestException as ex:

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:429, in BaseRester._submit_requests(self, url, criteria, use_document_model, parallel_param, num_chunks, chunk_size, timeout)
    425 remaining_docs_avail = {}
    427 initial_params_list = [{"url": url, "verify": True, "params": copy(crit)} for crit in new_criteria]
--> 429 initial_data_tuples = self._multi_thread(use_document_model, initial_params_list)
    431 for data, subtotal, crit_ind in initial_data_tuples:
    433     subtotals.append(subtotal)

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:634, in BaseRester._multi_thread(self, use_document_model, params_list, progress_bar, timeout)
    630 finished, futures = wait(futures, return_when=FIRST_COMPLETED)
    632 for future in finished:
--> 634     data, subtotal = future.result()
    636     if progress_bar is not None:
    637         progress_bar.update(len(data["data"]))

File ~/.conda/envs/materials-project/lib/python3.9/concurrent/futures/_base.py:439, in Future.result(self, timeout)
    437     raise CancelledError()
    438 elif self._state == FINISHED:
--> 439     return self.__get_result()
    441 self._condition.wait(timeout)
    443 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:

File ~/.conda/envs/materials-project/lib/python3.9/concurrent/futures/_base.py:391, in Future.__get_result(self)
    389 if self._exception:
    390     try:
--> 391         raise self._exception
    392     finally:
    393         # Break a reference cycle with the exception in self._exception
    394         self = None

File ~/.conda/envs/materials-project/lib/python3.9/concurrent/futures/thread.py:58, in _WorkItem.run(self)
     55     return
     57 try:
---> 58     result = self.fn(*self.args, **self.kwargs)
     59 except BaseException as exc:
     60     self.future.set_exception(exc)

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:685, in BaseRester._submit_request_and_process(self, url, verify, params, use_document_model, timeout)
    682 if response.status_code == 200:
    684     if self.monty_decode:
--> 685         data = json.loads(response.text, cls=MontyDecoder)
    686     else:
    687         data = json.loads(response.text)

File ~/.conda/envs/materials-project/lib/python3.9/json/__init__.py:359, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    357 if parse_constant is not None:
    358     kw['parse_constant'] = parse_constant
--> 359 return cls(**kw).decode(s)

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:475, in MontyDecoder.decode(self, s)
    473 else:
    474     d = json.loads(s)
--> 475 return self.process_decoded(d)

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in MontyDecoder.process_decoded(self, d)
    451     elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
    452         return bson.objectid.ObjectId(d["oid"])
--> 454     return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
    456 if isinstance(d, list):
    457     return [self.process_decoded(x) for x in d]

File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in
```

Looking through the stack trace, it looks like the api is trying to retrieve the docs associated with the latest task, and is unable to locate some vasp files for that task.
To get around this issue, I also tried retrieving ALL of the download information for this material:
And received the output (task docs metadata, task NOMAD url where it exists):
So although I can get the task info this way, it's not clear which of these calculations is associated with the material's charge density data.
The two questions I have:
`get_charge_density_from_material_id` method a bug?

Thanks for any help you can offer!