ventolab / CellphoneDB

CellPhoneDB can be used to search for a particular ligand/receptor, or interrogate your own HUMAN single-cell transcriptomics data.
https://www.cellphonedb.org/
MIT License

cellphonedb database generate fails with --user-interactions-only #71

Closed ZheFrench closed 1 year ago

ZheFrench commented 1 year ago

What input format (column names) is required to create a custom database from a user-defined set of interactions with --user-interactions?

This works, but it adds extra entries:

cellphonedb database generate --user-interactions "unique.csv" --result-path "/data/cellphoneCustom/"

This doesn't work:

cellphonedb database generate --user-interactions "unique.csv" --user-interactions-only --result-path "/data/cellphoneCustom/"

My input looks like this in both cases:

partner_a,partner_b
A2M,LRP1
AANAT,MTNR1A
AANAT,MTNR1B

-------------
Generate database input files:
    - Interactions: /data/villemin/data2/villemin/andrei_spatial/LRdb.unique.csv
-------------
read local uniprot file
[ ][APP][18/10/22-21:26:40][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:26:41][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
read local ensembl file
read local uniprot file
/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/core/generators/gene_generator.py:31: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  ensembl_db_filtered.dropna(inplace=True)
[ ][APP][18/10/22-21:26:53][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:26:53][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:26:53][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:26:53][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
Traceback (most recent call last):
  File "/data/villemin/anaconda3/envs/cellphone/bin/cellphonedb", line 8, in <module>
    sys.exit(cli())
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/database_terminal_api_endpoints/database_terminal_commands.py", line 118, in generate
    release=release
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/tools_terminal_api_endpoints/tools_terminal_commands.py", line 220, in generate_interactions
    result[result_columns].sort_values(['partner_a', 'partner_b']).to_csv(
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/frame.py", line 2912, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/indexing.py", line 1254, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/indexing.py", line 1304, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['source'] not in index"

OK, so I switched to:

partner_a,partner_b,source
A2M,LRP1,curated
AANAT,MTNR1A,curated
AANAT,MTNR1B,curated
ACE,AGTR2,curated

This still works:

cellphonedb database generate --user-interactions "unique.csv" --result-path "/data/cellphoneCustom/"

This fails again :/

cellphonedb database generate --user-interactions "unique.csv" --user-interactions-only --result-path "/data/cellphoneCustom/"

-------------
Generate database input files:
    - Interactions: /data/villemin/data2/villemin/andrei_spatial/LRdb.unique.source.csv
-------------
read local uniprot file
[ ][APP][18/10/22-21:37:10][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:37:11][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
read local ensembl file
read local uniprot file
/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/core/generators/gene_generator.py:31: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  ensembl_db_filtered.dropna(inplace=True)
[ ][APP][18/10/22-21:37:22][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:37:22][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:37:22][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:37:22][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:37:23][WARNING] Output directory (/data/villemin/data2/villemin/andrei_spatial/cellphoneCustom/) exist and is not empty. Result can overwrite old results
[ ][APP][18/10/22-21:37:23][WARNING] There are some proteins or complexes not interacting properly: `A2M, AANAT, ABCA1, ACE, ACKR1, ACKR2, ACKR3, ACKR4, ACTR2, ACVR1, ACVR1B, ACVR1C, ACVR2A, ACVR2B, ACVRL1, ADA, ADAM10, ADAM12, ADAM15, ADAM17, ADAM2, ADAM23, ADAM28, ADAM9, ADCY1, ADCY7, ADCY8, ADCY9, ADCYAP1, ADCYAP1R1, ADGRB2, ADGRE2, ADGRE5, ADGRG1, ADGRL1, ADGRL2, ADGRL4, ADIPOQ, ADM, ADM2, ADORA1, ADORA2B, ADRA2A, ADRA2B, ADRB2, ADRB3, AGR2, AGRN, AGRP, AGT, AGTR1, AGTR2, AHSG, ALB, ALCAM, ALK, ALOX5, AMELX, AMELY, AMFR, AMH, ANGPT1, ANGPT2, ANGPT4, ANGPTL1, ANGPTL2, ANGPTL3, ANGPTL4, ANOS1, ANXA1, APCDD1, APLN, APLP2, APOA1, APOA2, APOA4, APOB, APOC1, APOC2, APOC3, APOC4, APOD, APOE, APP, AQP1, AQP6, AREG, ARF1, ARPC5, ART1, ARTN, ASGR1, ASGR2, ASIC3, ASIP, ATP6AP2, AVP, AVPR1A, AVPR1B, AVPR2, AXL, AZGP1, B2M, BAMBI, BCAM, BCAN, BDKRB1, BDKRB2, ...
 ...
TACR3, TMEM67, TFR2, TRADD, TNFSF10, TP53, TNFSF11, TNFSF12, TNFSF13B, TNFSF18, TNFSF15, TNFSF8, TRHR, UTS2R`
[ ][CORE][18/10/22-21:37:23][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][18/10/22-21:37:23][INFO] Using custom database at /data2/USERS/villemin/andrei_spatial/cellphoneCustom/cellphonedb_user_2022-10-18-21_37.db
[ ][APP][18/10/22-21:37:23][INFO] Collecting protein
[ ][CORE][18/10/22-21:37:23][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][18/10/22-21:37:23][INFO] Using custom database at /data2/USERS/villemin/andrei_spatial/cellphoneCustom/cellphonedb_user_2022-10-18-21_37.db
[ ][CORE][18/10/22-21:37:23][INFO] Collecting protein
[ ][APP][18/10/22-21:37:23][INFO] Collecting gene
[ ][CORE][18/10/22-21:37:23][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][18/10/22-21:37:23][INFO] Using custom database at /data2/USERS/villemin/andrei_spatial/cellphoneCustom/cellphonedb_user_2022-10-18-21_37.db
[ ][CORE][18/10/22-21:37:23][INFO] Collecting gene
[ ][APP][18/10/22-21:37:23][INFO] Collecting complex
[ ][CORE][18/10/22-21:37:23][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][18/10/22-21:37:23][INFO] Using custom database at /data2/USERS/villemin/andrei_spatial/cellphoneCustom/cellphonedb_user_2022-10-18-21_37.db
[ ][CORE][18/10/22-21:37:23][INFO] Collecting complex
[ ][APP][18/10/22-21:37:24][INFO] Collecting interaction
[ ][CORE][18/10/22-21:37:24][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][18/10/22-21:37:24][INFO] Using custom database at /data2/USERS/villemin/andrei_spatial/cellphoneCustom/cellphonedb_user_2022-10-18-21_37.db
[ ][CORE][18/10/22-21:37:24][INFO] Collecting interaction
Traceback (most recent call last):
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'id_cp_interaction'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/generic.py", line 3574, in _set_item
    loc = self._info_axis.get_loc(key)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'id_cp_interaction'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/villemin/anaconda3/envs/cellphone/bin/cellphonedb", line 8, in <module>
    sys.exit(cli())
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/database_terminal_api_endpoints/database_terminal_commands.py", line 130, in generate
    data_path=output_path)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/database/manager/DatabaseVersionManager.py", line 90, in collect_database
    LocalCollectorLauncher(database_file_path).all(**kwargs)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/local_launchers/local_collector_launcher.py", line 34, in all
    self.interaction(interaction_filename, data_path)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/local_launchers/local_collector_launcher.py", line 24, in wrapper
    getattr(create_app(True, self.database_file, True).collect, method_name)(data)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/core/collectors/collector.py", line 40, in interaction
    interactions_processed = interaction_preprocess_collector.call(interactions, multidatas)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/core/collectors/interaction_preprocess_collector.py", line 8, in call
    interactions_processed = _set_interactor_property(interactions, multidatas)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/cellphonedb/src/core/collectors/interaction_preprocess_collector.py", line 25, in _set_interactor_property
    lambda interaction: unique_id_generator.interaction(interaction, ('_x', '_y')), axis=1)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/frame.py", line 3044, in __setitem__
    self._set_item(key, value)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/frame.py", line 3121, in _set_item
    NDFrame._set_item(self, key, value)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/generic.py", line 3577, in _set_item
    self._mgr.insert(len(self._info_axis), key, value)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 1189, in insert
    block = make_block(values=value, ndim=self.ndim, placement=slice(loc, loc + 1))
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 2722, in make_block
    return klass(values, ndim=ndim, placement=placement)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 2378, in __init__
    super().__init__(values, ndim=ndim, placement=placement)
  File "/data/villemin/anaconda3/envs/cellphone/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 131, in __init__
    f"Wrong number of items passed {len(self.values)}, "
ValueError: Wrong number of items passed 56, placement implies 1

luzgaral commented 1 year ago

Hi @ZheFrench

Apologies for the late reply.

The file passed to --user-interactions ("unique.csv") needs more mandatory fields: "partner_a", "partner_b", "annotation_strategy", and "source". Optional fields are "protein_name_a" and "protein_name_b".
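A quick way to check a file against the field list above before running the generator might look like the sketch below. It uses only the Python standard library and is not part of CellPhoneDB; the helper name `missing_mandatory_columns` is hypothetical, and the column names are taken from this thread only.

```python
import csv
import io

# Mandatory and optional column names as listed in this thread.
MANDATORY = ["partner_a", "partner_b", "annotation_strategy", "source"]
OPTIONAL = ["protein_name_a", "protein_name_b"]


def missing_mandatory_columns(csv_text):
    """Return the mandatory columns absent from the CSV header (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in MANDATORY if name not in present]


# Header from the first attempt in this issue.
first_attempt = "partner_a,partner_b\nA2M,LRP1\nAANAT,MTNR1A\n"
# Header from the second attempt, after adding `source`.
second_attempt = "partner_a,partner_b,source\nA2M,LRP1,curated\n"

print(missing_mandatory_columns(first_attempt))   # ['annotation_strategy', 'source']
print(missing_mandatory_columns(second_attempt))  # ['annotation_strategy']
```

Under these assumptions, both files from this issue would be flagged as incomplete. The second one still lacks "annotation_strategy".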

Find here the documentation.

Find here an example format with all the fields.
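The linked example is not reproduced in this thread, so here is only a rough sketch of how an existing file could be padded with the missing mandatory columns. The helper `add_missing_columns` is hypothetical, and the fill value "user_curated" is a placeholder assumption, not a value confirmed by the CellPhoneDB documentation. Whether this alone resolves the --user-interactions-only failure is not confirmed here.

```python
import csv
import io


def add_missing_columns(csv_text, defaults):
    """Return CSV text with each column in `defaults` added if absent (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    # Append only the columns the file does not already have.
    for name in defaults:
        if name not in fieldnames:
            fieldnames.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Existing values are kept; only absent columns get the default.
        for name, value in defaults.items():
            row.setdefault(name, value)
        writer.writerow(row)
    return out.getvalue()


# Placeholder defaults (assumptions, adjust to whatever the documentation specifies).
defaults = {"annotation_strategy": "user_curated", "source": "user_curated"}

original = "partner_a,partner_b,source\nA2M,LRP1,curated\nAANAT,MTNR1A,curated\n"
print(add_missing_columns(original, defaults))
```

Rows keep their existing "source" value; only the absent "annotation_strategy" column is filled with the placeholder.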

Hope this helps.

Best,

Luz