datajoint / datajoint-python

Relational data pipelines for the science lab
https://datajoint.com/docs
GNU Lesser General Public License v2.1
169 stars 84 forks source link

Alter fails on tables with filepath@ and no explicit index when index has not changed #992

Open rly opened 2 years ago

rly commented 2 years ago

Bug Report

Description

If a table has an external store, e.g., the "nwb_file_abs_path" field in the example below, then an INDEX is created on the table for this field. When alter() is called on this table and there have not been any changes to the INDEX part of the table definition (e.g., an attribute is added), the alter fails with the error "'table.alter cannot alter indexes (yet)'" because the definition from the database includes an index and the definition in code does not.

@schema
class Nwbfile(dj.Manual):
    definition = """
    nwb_file_name: varchar(2000)   # name of the NWB file
    ---
    nwb_file_abs_path: filepath@raw
    """

One way to work around this is simply to make the index explicit at the end of the table definition:

@schema
class Nwbfile(dj.Manual):
    definition = """
    nwb_file_name: varchar(2000)   # name of the NWB file
    ---
    nwb_file_abs_path: filepath@raw
    INDEX (nwb_file_abs_path)
    """

Then alter() when there are no changes to the index works as expected.

It would be better if datajoint recognizes that an external store means an index was created and to adjust some part of the following code accordingly so that alter() can work on tables with an external store and no change in the index: https://github.com/datajoint/datajoint-python/blob/master/datajoint/declare.py#L248-L252

Reproducibility

Windows, Python 3.8, DataJoint 0.13.2 and DataJoint Github and master branch

dimitri-yatsenko commented 2 years ago

alter has only been implemented for very simple changes at this point. It is a placeholder for a full-featured implementation. It cannot handle foreign key changes, indexes, etc. We should probably create a single issue that would describe all the cases for the proper alter implementation in one place. Datatypes with external storage create a foreign key to the hidden tables that handle all the external storage.

Another note: I see that you are using varchar(2000) for a primary key attribute. Primary key attributes should use compact data types to be conveniently referenced. If the NWB filename really is the best identifying attribute for these entities, then you could use a uuid made from its hash.

rly commented 2 years ago

Thanks @dimitri-yatsenko . I agree that it would be useful to have a single issue that documents what is and is not possible with this very useful method!

And just to be clear, I am not trying to alter the index, but rather other aspects of the table and because the table happens to have an implicit index from filepath@raw (I think), the alter() command fails with an error. I have updated the title and original post to reflect that.