aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io

Docs: Tutorial on creating new data type throws an exception #4867

Open Luthaf opened 3 years ago

Luthaf commented 3 years ago

Describe the current issue

I'm trying to follow https://aiida.readthedocs.io/projects/aiida-core/en/latest/topics/data_types.html#creating-a-data-plugin to understand how to create new data types. I run the following code:

import aiida
from aiida.orm import Data

aiida.load_profile("XXX")

class NewData(Data):
    """A new data type that wraps a single value."""

    def __init__(self, **kwargs):
        value = kwargs.pop("value")
        super().__init__(**kwargs)
        self.set_attribute("value", value)

node = NewData(value=5)
node.set_attribute("value", 6)
node.store()

But I run into the following exception:

Traceback (most recent call last):
  File "tmp.py", line 37, in <module>
    node.store()
  File "<>/virtualenv/lib/python3.7/site-packages/aiida/orm/nodes/node.py", line 1056, in store
    self.validate_storability()
  File "<>/virtualenv/lib/python3.7/site-packages/aiida/orm/nodes/node.py", line 267, in validate_storability
    raise exceptions.StoringNotAllowed(msg)
aiida.common.exceptions.StoringNotAllowed: class `__main__:NewData` does not have registered entry point

Describe the solution you'd like

The tutorial should be runnable without throwing an exception.

mbercx commented 3 years ago

Hi Luthaf! 👋 Thanks for reporting. The issue is that your new data type doesn't have a registered entry point, as the error message indicates. For more information on entry points, you can have a look here:

https://aiida.readthedocs.io/projects/aiida-core/en/latest/topics/plugins.html?highlight=entry%20point#what-is-an-entry-point

Although the "Topics" section doesn't typically function as a tutorial, the section you linked to definitely should mention entry points and link to the corresponding part of the documentation. It also sort of reads more like a guide to me, so maybe it should be moved to the "How to" section. What do you think, @csadorf?
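To make the term concrete, here is a minimal sketch of what an entry point declaration for a data plugin could look like and what it resolves to. The package name `my_plugin`, the module `my_plugin.data`, and the entry point name `my_plugin.new_data` are hypothetical and only serve as illustration; `aiida.data` is the entry point group AiiDA uses for data plugins. The dictionary has the shape that setuptools accepts for the `entry_points` argument of `setup()`, and the helper `parse_entry_points` is an illustrative function, not part of AiiDA.

```python
from importlib.metadata import EntryPoint

# Hypothetical layout: a package `my_plugin` with `NewData` defined in `my_plugin/data.py`.
# This dict has the shape setuptools expects for `setup(entry_points=...)`.
ENTRY_POINTS = {
    "aiida.data": [
        "my_plugin.new_data = my_plugin.data:NewData",
    ],
}


def parse_entry_points(declared):
    """Turn 'name = module:attribute' strings into EntryPoint objects (illustrative helper)."""
    parsed = []
    for group, specs in declared.items():
        for spec in specs:
            name, _, value = spec.partition("=")
            parsed.append(EntryPoint(name=name.strip(), value=value.strip(), group=group))
    return parsed


for ep in parse_entry_points(ENTRY_POINTS):
    module_name, _, attribute = ep.value.partition(":")
    print(f"group={ep.group} name={ep.name} -> class {attribute} in module {module_name}")
```

Once a package with such a declaration is installed, the class is no longer anonymous to AiiDA: the entry point name is what links a stored node back to the Python class that defines it.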

Luthaf commented 3 years ago

If I read everything correctly, this means that if I want to create new data types to be used with aiida, I need to create a separate package with setup.py and setuptools, define an entry point there, and install this package somewhere so that setuptools can find it, right?

This feels a bit heavyweight for my use case, but I might be able to do it differently. Is there an easier way to have a workflow step output references to existing data stored in the database (e.g. a step that takes in 1000 structures and picks 25 for future calculations)? I could have this step output and store a list of primary keys in the database, and then follow the primary keys as needed, but this feels a bit hacky.

mbercx commented 3 years ago

> If I read everything correctly, this means that if I want to create new data types to be used with aiida, I need to create a separate package with setup.py and setuptools, define an entry point there, and install this package somewhere so that setuptools can find it, right?

Yes, afaik an entry point is required for every data type.
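After installing such a package, it can be useful to check that the entry point is actually visible to Python. The sketch below uses only the standard library and lists the names registered in the `aiida.data` group; the function name `registered_data_entry_points` is an illustrative helper, not an AiiDA API. On an environment without AiiDA or any plugins installed, it simply returns an empty list.

```python
import sys
from importlib.metadata import entry_points


def registered_data_entry_points(group="aiida.data"):
    """Return the sorted names of all installed entry points in `group`."""
    if sys.version_info >= (3, 10):
        # Selection by keyword is available from Python 3.10 onwards.
        found = entry_points(group=group)
    else:
        # Older versions return a dict mapping group names to entry points.
        found = entry_points().get(group, [])
    return sorted(ep.name for ep in found)


if __name__ == "__main__":
    for name in registered_data_entry_points():
        print(name)
```

If the name declared for the new data type does not show up in this list, the package is probably not installed in the active environment. The `verdi plugin list aiida.data` command should give a similar overview from the command line.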

> This feels a bit heavyweight for my use case, but I might be able to do it differently. Is there an easier way to have a workflow step output references to existing data stored in the database (e.g. a step that takes in 1000 structures and picks 25 for future calculations)? I could have this step output and store a list of primary keys in the database, and then follow the primary keys as needed, but this feels a bit hacky.

Hmm, I might need a bit more information on what you are trying to achieve. Outputting a List node with the UUIDs is possible (I'd use these over the PKs, since PKs are only unique within a single database), but unless this List is a direct input for another work chain/function you'd lose the provenance (and yeah, it feels a little hacky). For a work chain, you could perhaps create an output namespace called structures and assign an index to each output structure. This is similar to what is done for the EquationOfStateWorkChain in the aiida-common-workflows plugin, which defines such a namespace and then sets the output structures on it.

The outputs will then look something like this when executing verdi process show <PK>:

Outputs         PK    Type
--------------  ----  -------------
structures
    6           9908  StructureData
    5           9904  StructureData
    4           9900  StructureData
    3           9896  StructureData
    2           9892  StructureData
    1           9888  StructureData
    0           9734  StructureData

And you could query for the structures in the verdi shell with e.g.:

In [1]: StructureData = DataFactory('structure')
In [2]: qb = QueryBuilder().append(
   ...:     WorkChainNode, filters={'id': 9703}, tag='eos'
   ...: ).append(
   ...:     StructureData, with_incoming='eos', edge_filters={'label': {'like': 'structure_%'}}
   ...: )

In [3]: qb.all(flat=True)
Out[3]: 
[<StructureData: uuid: 05c5a3e4-c68f-4bde-abe7-476150e16d8b (pk: 9908)>,
 <StructureData: uuid: a3e92f69-f836-423b-9c5d-b433dd1e769e (pk: 9904)>,
 <StructureData: uuid: 81c07e28-a254-4dd7-bd5f-d6c755ff036e (pk: 9900)>,
 <StructureData: uuid: a0741656-3a1c-48ed-8da3-c86e1ad0abd8 (pk: 9896)>,
 <StructureData: uuid: 47e56421-4fb1-4bd3-91a1-11936d2abfdc (pk: 9892)>,
 <StructureData: uuid: e6387931-c97f-4053-9590-84dd32e687a2 (pk: 9888)>,
 <StructureData: uuid: 503d628e-2ccd-4566-a40d-905d67327012 (pk: 9734)>]
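The selection step itself, picking 25 out of 1000 structures and giving each one an index in a structures namespace, can be sketched in plain Python. This is only an illustration under stated assumptions: UUID strings stand in for the StructureData nodes, random sampling is a placeholder for whatever selection criterion is actually used, and `pick_structures` is a hypothetical helper, not an AiiDA function. In a real work chain, each value would be a node attached to the output namespace rather than a string.

```python
import random
import uuid


def pick_structures(uuids, count, seed=0):
    """Pick `count` entries from `uuids` and key them by index, as in an output namespace."""
    uuids = list(uuids)
    if count > len(uuids):
        raise ValueError(f"cannot pick {count} items from {len(uuids)} candidates")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    chosen = rng.sample(uuids, count)
    # Keys are string indices: '0', '1', ..., one per selected structure.
    return {str(index): chosen_uuid for index, chosen_uuid in enumerate(chosen)}


# Stand-ins for 1000 stored structures (placeholder data, not real nodes).
candidates = [str(uuid.uuid4()) for _ in range(1000)]
structures = pick_structures(candidates, 25)

for index, chosen_uuid in structures.items():
    print(index, chosen_uuid)
```

Keeping the selected structures as individually labelled outputs, rather than as a list of identifiers, is what preserves the provenance links described above.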

Let me know if that suits your purposes.

Luthaf commented 3 years ago

Thanks a lot for the detailed proposition! It looks like it could work for me, I'll give it a try.

mbercx commented 3 years ago

Great, hope it does the job! Will leave the issue open since we should definitely add a note on entry points in the "Create data type plugin" and perhaps move this section to the How to's.

csadorf commented 3 years ago

> Although the "Topics" section doesn't typically function as a tutorial, the section you linked to definitely should mention entry points and link to the corresponding part of the documentation. It also sort of reads more like a guide to me, so maybe it should be moved to the "How to" section. What do you think, @csadorf?

Indeed, we might want to move this into the how-to section. However, in this case we of course need to make sure that everything works as expected and without error. The basic assumption is that tutorials outline every step exactly and when followed should work exactly. How-to guides should be more focused and make assumptions about the user's knowledge and omit specifics, but they should still be complete.