pyiron / FAQs

General question board for pyiron users

Create a customized pyiron parser #32

Open aageo25 opened 6 months ago

aageo25 commented 6 months ago

Hi.

I am writing a custom pyiron parser based on @jan-janssen's Quantum ESPRESSO parser notebook. I managed to create my job class:

pr.create_job_class(
    class_name="CASTEPJob",
    write_input_funct=write_castep_inputs,
    collect_output_funct=collect_output,
    default_input_dict={  # Default Parameter 
        "structure": None, 
        # CASTEP has an on-the-fly PP generator
        #"pseudopotentials": {"Al": "Al.pbe-n-kjpaw_psl.1.0.0.UPF"}, 
        "kpts": (3, 3, 3),
        "task": "singlepoint",
        "smearing": 0.02,
    },
    executable_str="mpirun -np 8 castep.mpi",
)
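
For context, here is a trimmed sketch of the two functions I pass in, assuming the call pattern from the Quantum ESPRESSO notebook (the writer receives the input dictionary and the working directory, the collector receives the working directory and returns a plain dictionary; `write_param`, `write_cell` and `parse_castep` are my own helpers):

def write_castep_inputs(input_dict, working_directory):
    # castep.param: plain key-value pairs
    write_param(input_dict, working_directory=working_directory)
    # castep.cell: lattice, positions and k-points, written via ASE
    write_cell(input_dict["structure"], input_dict["kpts"], working_directory)


def collect_output(working_directory):
    # parse_castep turns the CASTEP output into a plain dictionary
    return parse_castep(f"{working_directory}/castep.castep")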

My workflow is as follows:

job_workflow = pr.wrap_python_function(workflow)
job_workflow.input.project = pr
job_workflow.input.structure = bulk('Al', a=4.05, cubic=True)
job_workflow.run()
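
For completeness, the wrapped workflow function looks roughly like this (trimmed; the job name and the exact body are placeholders, the point is only that it takes the project and the structure as arguments):

def workflow(project, structure):
    # create and run a single CASTEP job with the registered job class
    job = project.create.job.CASTEPJob(job_name="castep_al")
    job.input.structure = structure
    job.run()
    return job.output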

The run command raises a TypeError:

TypeError: Error saving /cephyr/NOBACKUP/groups/itaip/users/ageo/projects/ipd/pyiron-castep/castep-parser/test/ (key project__index_0): DataContainer doesn't support saving elements of type "<class 'pyiron_base.project.generic.Project'>" to HDF!
niklassiemer commented 6 months ago

Hi @aageo25,

Why do you specify the project as input, `job_workflow.input.project = pr`? The pyiron job should always keep track of its project already. And how is `write_castep_inputs` defined? Do you need the project as input? (The error is simply because the Project does not know how to store itself in hdf)

jan-janssen commented 6 months ago

> (The error is simply because the Project does not know how to store itself in hdf)

In Python 3.11, `Project` can use `__getstate__()` and `__setstate__()` to store itself. In older Python versions like 3.9, this is not available.
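
For illustration, the general pattern (a generic sketch, not the actual Project implementation) looks like this:

class Example:
    def __init__(self, path):
        self.path = path
        self._handle = open(path)  # not serializable

    def __getstate__(self):
        # return only the serializable part of the state
        state = self.__dict__.copy()
        del state["_handle"]
        return state

    def __setstate__(self, state):
        # restore the attributes and recreate the unserializable part
        self.__dict__.update(state)
        self._handle = open(self.path)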

aageo25 commented 6 months ago

> why do you specify the project as input job_workflow.input.project = pr?

This was unnecessary in my case. I removed it.

> And how is the write_castep_inputs defined?

CASTEP requires two input files to run, `castep.param` and `castep.cell`. The `castep.param` file contains key-value pairs that I iterate over from a Python dict:


import os


def write_param(input_dict, working_directory, seedname="castep"):
    """Write a CASTEP .param file. The .param file has a simple
    dictionary-like structure, with colons as key-value separators.

    Keyword arguments:
    input_dict -- python dictionary to write in <seedname>.param
    working_directory -- where CASTEP will run
    seedname -- identifier for the CASTEP .param file
    """
    os.makedirs(working_directory, exist_ok=True)
    with open(f"{working_directory}/{seedname}.param", "w") as f:
        for key, value in input_dict["param"].items():
            f.write(f"{key}: {value}\n")


The `castep.cell` file contains block data, so I use `ASE` to collect the information and write the file. For now the function works, but I may switch to another strategy in the future.
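
For reference, a minimal sketch of how the `.cell` file could be written with ASE's `castep-cell` writer (the `write_cell` name and the appended k-point line are my own choices):

from ase.io import write


def write_cell(structure, kpts, working_directory):
    cell_file = f"{working_directory}/castep.cell"
    # ASE writes the lattice and positions blocks of the .cell file
    write(cell_file, structure, format="castep-cell")
    # append the Monkhorst-Pack grid as a plain keyword line
    with open(cell_file, "a") as f:
        f.write(f"\nKPOINT_MP_GRID {kpts[0]} {kpts[1]} {kpts[2]}\n")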
aageo25 commented 6 months ago

> (The error is simply because the Project does not know how to store itself in hdf)
>
> In Python 3.11, `Project` can use `__getstate__()` and `__setstate__()` to store itself. In older Python versions like 3.9, this is not available.

Thanks @jan-janssen. My current Python version is 3.9.

niklassiemer commented 6 months ago

The job should also already have a `working_directory` attribute which you may use :)

aageo25 commented 6 months ago

I updated Python to 3.11 and managed to make pyiron execute my custom `executable_str`. My `parse_castep` and `collect_output` functions seem to be working fine, but when pyiron stores ASE's Atoms object from the output, I get the following error, now related to a dict:

File ~/projects/ipd/pyiron-castep/newitaip/lib/python3.11/site-packages/pyiron_base/storage/datacontainer.py:804, in DataContainer._to_hdf(self, hdf)
    802             hdf[k] = v
    803         except TypeError:
--> 804             raise TypeError(
    805                 "Error saving {} (key {}): DataContainer doesn't support saving elements "
    806                 'of type "{}" to HDF!'.format(v, k, type(v))
    807             ) from None
    808 for n in hdf.list_nodes() + hdf.list_groups():
    809     if n not in written_keys:
TypeError: Error saving {'atoms': Atoms(symbols='Al4', pbc=True, cell=[4.05, 4.05, 4.05], calculator=Castep(...))} (key cell__index_1): DataContainer doesn't support saving elements of type "<class 'dict'>" to HDF!

@jan-janssen, I am not using the `ase_to_pyiron` function anymore. Storing an Atoms object for the input works as it should.

aageo25 commented 6 months ago

The issue happened when we tried to store ASE's Atoms object with a Calculator attached.

`atoms.copy()` solves the issue, since the copy doesn't include the calculator originally attached to the Atoms object.
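
For anyone hitting the same thing, a minimal sketch of the fix inside the output-collection step (the return key and the `parse_castep` call signature are just how I happen to have it set up):

def collect_output(working_directory):
    # parse_castep returns a dict whose "atoms" entry still carries the
    # Castep calculator that ASE attached when reading the output
    output = parse_castep(f"{working_directory}/castep.castep")
    # store a calculator-free copy so the DataContainer can write it to HDF
    output["structure"] = output.pop("atoms").copy()
    return output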