MetaCell / scicrunch-antibody-registry

A repo for the SciCrunch antibody registry portal
Apache License 2.0

Import issues #138

Closed masonpairish closed 1 year ago

masonpairish commented 1 year ago
  1. The identity of the submitter is not captured; the submitter shows as unknown.
  2. New submissions should be 'curated' instead of 'queued'.
  3. 'Make a duplicate' gives duplicates a new RRID instead of the already existing RRID.
  4. The 'remove' keyword does not work as intended. It is supposed to clear data from a field when doing "update filled columns".
  5. The clone id field seems to be ignored.
  6. The fields are incorrect ("GeneID" and "UNIPROT" are missing).
     Listed: "NAME, VENDOR, base cat, URL, TARGET, SPECIES, CLONALITY, HOST, clone, ISOTYPE, CONJUGATE, FORM, COMMENTS, CITATION, SUBREGION, MODIFICATION, DISC, TYPE, EPITOPE, CAT ALT, id, ab_id_old, ix"
     Proper: "ab_name, vendor, catalog_num, url, ab_target, target_species, clonality, source_organism, clone_id, product_isotype, product_conjugate, product_form, comments, defining_citation, target_subregion, target_modification, ab_target_entrez_gid, disc_date, commercial_type, uniprot_id, epitope, cat_alt, id, ab_id_old, ix"
     or: "NAME, VENDOR, CAT NUM, URL, TARGET, SPECIES, CLONALITY, HOST, CLONE, ISOTYPE, CONJUGATE, FORM, COMMENTS, CITATION, SUBREGION, MODIFICATION, GeneID, DISC, TYPE, UNIPROT, EPITOPE, CAT ALT, id, ab_id_old, ix"
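
The header mismatch in item 6 can be checked mechanically. The following is a minimal sketch, not the registry's actual code: it compares the listed headers against the short "proper" list, ignoring case, and pairs the two "proper" lists by position. The positional pairing and the names `compare_headers` and `SHORT_TO_FIELD` are assumptions made for illustration.

```python
# Header lists copied from item 6 of the report above.
FIELD_NAMES = [
    "ab_name", "vendor", "catalog_num", "url", "ab_target", "target_species",
    "clonality", "source_organism", "clone_id", "product_isotype",
    "product_conjugate", "product_form", "comments", "defining_citation",
    "target_subregion", "target_modification", "ab_target_entrez_gid",
    "disc_date", "commercial_type", "uniprot_id", "epitope", "cat_alt",
    "id", "ab_id_old", "ix",
]
SHORT_HEADERS = [
    "NAME", "VENDOR", "CAT NUM", "URL", "TARGET", "SPECIES", "CLONALITY",
    "HOST", "CLONE", "ISOTYPE", "CONJUGATE", "FORM", "COMMENTS", "CITATION",
    "SUBREGION", "MODIFICATION", "GeneID", "DISC", "TYPE", "UNIPROT",
    "EPITOPE", "CAT ALT", "id", "ab_id_old", "ix",
]
LISTED_HEADERS = [
    "NAME", "VENDOR", "base cat", "URL", "TARGET", "SPECIES", "CLONALITY",
    "HOST", "clone", "ISOTYPE", "CONJUGATE", "FORM", "COMMENTS", "CITATION",
    "SUBREGION", "MODIFICATION", "DISC", "TYPE", "EPITOPE", "CAT ALT",
    "id", "ab_id_old", "ix",
]

# Assumption: the two "proper" lists describe the same columns in the same order.
SHORT_TO_FIELD = dict(zip(SHORT_HEADERS, FIELD_NAMES))


def compare_headers(listed, expected):
    """Return (missing, unexpected) headers, comparing case-insensitively."""
    def norm(header):
        return header.strip().lower()

    listed_keys = {norm(h) for h in listed}
    expected_keys = {norm(h) for h in expected}
    missing = [h for h in expected if norm(h) not in listed_keys]
    unexpected = [h for h in listed if norm(h) not in expected_keys]
    return missing, unexpected


missing, unexpected = compare_headers(LISTED_HEADERS, SHORT_HEADERS)
print("missing:", missing)
print("unexpected:", unexpected)
```

Under these assumptions the comparison also flags "base cat" as a header that does not correspond to "CAT NUM", in addition to the two missing columns named in the report.
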
filippomc commented 1 year ago

Regarding item 6: not displayed here (see image.png).

filippomc commented 1 year ago

the import is failing

afonsobspinto commented 1 year ago

Can you please provide more details on why it is failing? cc @filippomc @masonpairish I just tried locally and was able to both insert and update:

Insert: (two screenshots)

Update: (two screenshots; there was a bug here because the accession was wrongly overwritten, fixed in #157)

masonpairish commented 1 year ago

Errors
Line number: 1 - 'NoneType' object has no attribute 'fields'
Moesin Antibody, Proteintech, 82009-1-RR, https://www.ptglab.com/Products/MSN-Antibody-82009-1-RR.htm, Moesin, Human, mouse, rat, Recombinant, Rabbit, 3B8, IgG, Unconjugated, Protein A purification; PBS with 0.02% sodium azide and 50% glycerol pH 7.3., Applications: WB, ELISA, , , , 4478, , commercial, , ,
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/import_export/resources.py", line 702, in import_row
    self.before_import_row(row, **kwargs)
  File "/usr/src/app/./api/resources/AntibodyResource.py", line 202, in before_import_row
    for field in antibody_identifier.fields:
AttributeError: 'NoneType' object has no attribute 'fields'

masonpairish commented 1 year ago

After some testing, it appears the import function checks the names of the headers, and they must match or this error occurs. This checking was never done in the old system. So now I can import a csv if I choose to do nothing with it. But if I try to add new or update, I get this error:

Line number: 1 - get() returned more than one Antigen -- it returned 2!
Moesin Antibody, Proteintech, 82009-1-RR, https://www.ptglab.com/Products/MSN-Antibody-82009-1-RR.htm, Moesin, Human, mouse, rat, Recombinant, Rabbit, 3B8, IgG, Unconjugated, Protein A purification; PBS with 0.02% sodium azide and 50% glycerol pH 7.3., Applications: WB, ELISA, , , , 4478, , commercial, , ,
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/import_export/resources.py", line 727, in import_row
    self.import_obj(instance, row, dry_run, kwargs)
  File "/usr/local/lib/python3.9/site-packages/import_export/resources.py", line 561, in import_obj
    self.import_field(field, obj, data, kwargs)
  File "/usr/src/app/./api/resources/AntibodyResource.py", line 222, in import_field
    field.save(obj, data, is_m2m, kwargs)
  File "/usr/local/lib/python3.9/site-packages/import_export/fields.py", line 110, in save
    cleaned = self.clean(data, kwargs)
  File "/usr/local/lib/python3.9/site-packages/import_export/fields.py", line 66, in clean
    value = self.widget.clean(value, row=data, kwargs)
  File "/usr/src/app/./api/widgets/foreign_key_widget.py", line 15, in clean
    return super().clean(value, row, kwargs)
  File "/usr/local/lib/python3.9/site-packages/import_export/widgets.py", line 414, in clean
    return self.get_queryset(value, row, kwargs).get({self.field: val})
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 653, in get
    raise self.model.MultipleObjectsReturned(
api.models.Antigen.MultipleObjectsReturned: get() returned more than one Antigen -- it returned 2!
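
The second error is what a strict single-row lookup produces when the lookup value matches more than one row. The following is a minimal sketch that reproduces the behavior without Django, using an in-memory list as a stand-in for the Antigen table. The `symbol` field, the sample rows, and the helpers `strict_get` and `first_match` are hypothetical; this is not the project's widget code, and picking the lowest id is only one possible policy.

```python
class MultipleObjectsReturned(Exception):
    """Stand-in for the Django exception of the same name."""


class DoesNotExist(Exception):
    """Stand-in for the Django exception of the same name."""


def _matches(rows, criteria):
    """Return every row whose fields equal all of the given criteria."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


def strict_get(rows, **criteria):
    """Mimic a get()-style lookup: exactly one match or an exception."""
    found = _matches(rows, criteria)
    if not found:
        raise DoesNotExist("Antigen matching query does not exist.")
    if len(found) > 1:
        raise MultipleObjectsReturned(
            f"get() returned more than one Antigen -- it returned {len(found)}!"
        )
    return found[0]


def first_match(rows, **criteria):
    """Tolerate duplicates by returning the match with the lowest id, or None."""
    found = sorted(_matches(rows, criteria), key=lambda r: r["id"])
    return found[0] if found else None


# Hypothetical data: two antigen rows share the same symbol.
ANTIGENS = [
    {"id": 7, "symbol": "Moesin"},
    {"id": 3, "symbol": "Moesin"},
    {"id": 9, "symbol": "Actin"},
]

try:
    strict_get(ANTIGENS, symbol="Moesin")
except MultipleObjectsReturned as exc:
    print("strict lookup failed:", exc)

print("tolerant lookup:", first_match(ANTIGENS, symbol="Moesin"))
```

In Django itself, the tolerant variant roughly corresponds to `filter(...).order_by("id").first()`. Whether duplicate antigens should be tolerated or cleaned up in the database is a decision for the maintainers.
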

filippomc commented 1 year ago

@masonpairish how were the fields matched without headers? It would also help to have some more files, if you can share them with us, including some big files for stress testing.

masonpairish commented 1 year ago

There was no matching done. The code literally ignored the first row and assumed everything would be in the proper place.
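
The old behavior described here can be sketched as follows. This is an illustration only, assuming the old importer simply paired each column with a fixed list of field names by position; `rows_by_position` and the three-column sample are hypothetical and are not taken from the old system's code.

```python
import csv
import io


def rows_by_position(text, field_names):
    """Skip the first row and map each later row to field_names by position."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # the header row is ignored, whatever it contains
    result = []
    for row in reader:
        # Pad short rows so every field name receives a value.
        padded = row + [""] * (len(field_names) - len(row))
        # zip() stops at the shorter input, so extra trailing cells are dropped.
        result.append(dict(zip(field_names, padded)))
    return result


# The header names here do not match the field names, yet the import works.
SAMPLE = "NAME,VENDOR,base cat\nMoesin Antibody,Proteintech,82009-1-RR\n"
FIELDS = ["ab_name", "vendor", "catalog_num"]

for record in rows_by_position(SAMPLE, FIELDS):
    print(record)
```

Positional mapping accepts any header text, which explains why the old files imported cleanly, but it silently shifts every value if a column is missing or reordered.
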

filippomc commented 1 year ago

We need to understand what happens if we remove one header (for instance, ix is not needed for new uploads).
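
One way to reason about a removed header is to treat some columns as optional and fail early, with a readable message, on the rest. The following is a minimal sketch using only the standard library. Treating `ix` as the single optional column follows the comment above; the function name `read_rows`, the three-column sample, and the choice to fill absent optional columns with an empty string are assumptions.

```python
import csv
import io

OPTIONAL_HEADERS = {"ix"}  # assumption: only ix may be left out


def read_rows(text, expected, optional=OPTIONAL_HEADERS):
    """Read CSV text by header name, allowing optional columns to be absent."""
    reader = csv.DictReader(io.StringIO(text))
    present = set(reader.fieldnames or [])
    missing = [h for h in expected if h not in present and h not in optional]
    if missing:
        # Fail before touching any row, naming the headers that are absent.
        raise ValueError("missing required headers: " + ", ".join(missing))
    # Absent optional columns (and short rows) become empty strings.
    return [{h: row.get(h) or "" for h in expected} for row in reader]


EXPECTED = ["NAME", "VENDOR", "ix"]

# The ix column is removed: the file is still accepted.
print(read_rows("NAME,VENDOR\nMoesin Antibody,Proteintech\n", EXPECTED))

# A required column is removed: the error names it.
try:
    read_rows("NAME,ix\nMoesin Antibody,1\n", EXPECTED)
except ValueError as exc:
    print(exc)
```
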