pepkit / peppy

Project metadata manager for PEPs in Python
https://pep.databio.org/peppy
BSD 2-Clause "Simplified" License

`peppy` does not initialize correctly when the sample table index name differs from the default #400

Closed rafalstepien closed 1 month ago

rafalstepien commented 2 years ago

For a PEP project that defines a sample_table_index other than the default (sample_name), the following error is raised:

(databio) cgf8xr@cphg-fqvt2j3:~/databio/repos/pep-nextflow/pseudo_nextflow_task$ bash eido_convert.sh 
Detecting duplicate sample names ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
Traceback (most recent call last):
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/ordattmap.py", line 46, in __getitem__
    return super(OrdAttMap, self).__getitem__(item)
KeyError: 'sample_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cgf8xr/databio/venvs/databio/bin/eido", line 8, in <module>
    sys.exit(main())
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/eido/cli.py", line 92, in main
    p = Project(args.pep, sample_table_index=args.st_index)
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 156, in __init__
    self.create_samples(modify=False if self[SAMPLE_TABLE_FILE_KEY] else True)
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 317, in create_samples
    self.modify_samples()
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 486, in modify_samples
    self.attr_merge()
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 723, in attr_merge
    if n not in [s[SAMPLE_NAME_ATTR] for s in self.samples]:
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 723, in <listcomp>
    if n not in [s[SAMPLE_NAME_ATTR] for s in self.samples]:
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/pathex_attmap.py", line 59, in __getitem__
    v = super(PathExAttMap, self).__getitem__(item)
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/ordattmap.py", line 48, in __getitem__
    return AttMap.__getitem__(self, item)
  File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/attmap.py", line 32, in __getitem__
    return self.__dict__[item]
KeyError: 'sample_name'

Because of that, I can't use the eido convert command successfully.

rafalstepien commented 2 years ago

samplesheet.csv subsamplesheet.csv

Since I cannot attach .yaml files, the raw config is below:

pep_version: "2.0.0"
sample_table: "samplesheet.csv"
subsample_table: "subsamplesheet.csv"
sample_table_index: "sample"
subsample_table_index: "sample"

rafalstepien commented 2 years ago

The direct cause of this error is that peppy.Project initialization creates the attributes self.st_index and self.sst_index, but later code still uses the hardcoded constants (SAMPLE_NAME_ATTR and SUBSAMPLE_NAME_ATTR) instead of those attributes. In the traceback above, attr_merge looks up s[SAMPLE_NAME_ATTR] on every sample, which raises KeyError: 'sample_name' when the index column is named sample.
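
To illustrate the pattern, here is a minimal sketch with plain dictionaries. It is not peppy's actual code: the sample data and both helper functions (names_hardcoded, names_by_index) are hypothetical, and only the constant name SAMPLE_NAME_ATTR and its value sample_name come from the traceback.

```python
# Hypothetical stand-ins for illustration; this is not peppy's implementation.
SAMPLE_NAME_ATTR = "sample_name"  # the hardcoded default key seen in the traceback

# Made-up samples whose index column is "sample", as in the config above.
samples = [
    {"sample": "s1", "file": "a.fq"},
    {"sample": "s2", "file": "b.fq"},
]


def names_hardcoded(samples):
    # Mirrors the failing pattern: always looks up the default key.
    return [s[SAMPLE_NAME_ATTR] for s in samples]


def names_by_index(samples, st_index=SAMPLE_NAME_ATTR):
    # Looks up whichever column was configured as the sample table index.
    return [s[st_index] for s in samples]


try:
    names_hardcoded(samples)
except KeyError as err:
    print(f"KeyError: {err}")

print(names_by_index(samples, st_index="sample"))
```

The first call fails with a KeyError because no sample has a sample_name key, while the second call succeeds because it uses the configured index name.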

nsheff commented 2 years ago

probably a relic of the time before this was a variable option

rafalstepien commented 2 years ago

After resolving the first issue, the command eido convert pep_files/config.yaml -f csv reports success, but it also prints these warnings:

Could not set subsample_table index. At least one of the requested columns does not exist: sample
Could not set subsample_table index. At least one of the requested columns does not exist: sample

The output file also does not look correct: converted.csv
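
One way to check whether a table really lacks the requested index column is to inspect its header directly. The following is only a sketch using the standard library: missing_index_columns is a hypothetical helper, and the table contents are made up for illustration (they are not the contents of the attached subsamplesheet.csv).

```python
import csv
import io


def missing_index_columns(csv_text, index_columns):
    """Return the requested index columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty input yields an empty header
    return [col for col in index_columns if col not in header]


# Made-up table contents, for illustration only.
example = "sample_name,file\ns1,a.fq\ns1,b.fq\n"

print(missing_index_columns(example, ["sample"]))
print(missing_index_columns(example, ["sample_name"]))
```

A non-empty result for the configured index name would be consistent with the warning above; an empty result would suggest the column exists in the file and is lost somewhere during loading.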

khoroshevskyi commented 1 month ago

After running this command:

eido convert /home/bnt4me/virginia/repos/peppy/tests/data/example_peps-master/example_nextflow_subsamples/project_config.yaml --f csv --st-index sample --sst-index sample >> new.csv

On my side, eido produces a correct CSV file: new.csv

Closing this issue