TL;DR: I think this can be addressed in looper. Just replace any `SAMPLE_NAME_ATTR` occurrences with `Project.sample_table_index`.
I think I know what is going on here. This PEP likely uses new functionality in the PEP spec: a custom sample table index (not `sample_name`). I think it was introduced in the latest, probably still unreleased, PEP spec. When I go to the "latest" docs version on the PEP spec page, the "sample table index" docs are missing. They are available here: http://pep.databio.org/en/2.1.0/specification/#sample-table-specification.
So this is not a breaking change to the PEP spec. All PEPs that used `sample_name` will still work, as long as we update the software to use the `Project.sample_table_index` property to determine which `Sample` attribute to use to refer to samples (we used `sample_name` up to this point).
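To make the backward-compatibility point concrete, here is a minimal sketch of resolving a sample's identifier through a project-level index attribute. The `Project` and `Sample` classes, the `DEFAULT_INDEX` constant, and the `sample_ids` helper are hypothetical stand-ins written for illustration. They are not peppy's or looper's actual implementation, and the `config` dict is only assumed to mirror how the index might be configured.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for illustration; not peppy's or looper's real classes.
DEFAULT_INDEX = "sample_name"


class Sample(dict):
    """Minimal sample: a dict of attributes read from one table row."""


@dataclass
class Project:
    samples: list
    config: dict = field(default_factory=dict)

    @property
    def sample_table_index(self) -> str:
        # Fall back to "sample_name" when the config names no custom index.
        return self.config.get("sample_table_index", DEFAULT_INDEX)


def sample_ids(project: Project) -> list:
    """Return each sample's identifier using the project's index attribute."""
    index = project.sample_table_index
    return [sample[index] for sample in project.samples]


# A project that never sets a custom index keeps working via the default.
legacy = Project(samples=[Sample(sample_name="s1"), Sample(sample_name="s2")])

# A project with a custom index is handled by the same code path.
custom = Project(
    samples=[Sample(id="a"), Sample(id="b")],
    config={"sample_table_index": "id"},
)
```

Under these assumptions, code that hard-codes `sample_name` fails on `custom`, while code that reads the index from the project instance handles both projects.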
That makes perfect sense, thanks! Very helpful.
I was able to replace `SAMPLE_NAME_ATTR` with `Project.sample_table_index` for a few, like:
But for others, like the one below, I also get a dictionary key error: https://github.com/pepkit/looper/blob/df535be2a0bd7257ee37148cb601d688ec557909/looper/conductor.py#L395-L397
looper run project/project_config.yaml
Looper version: 1.4.0
Command: run
Using default config. No config found in env var: ['DIVCFG']
Pipestat compatible: False
Traceback (most recent call last):
  File "/home/aaronobrien/projects/looper/.venv/lib/python3.10/site-packages/attmap/ordattmap.py", line 46, in __getitem__
    return super(OrdAttMap, self).__getitem__(item)
KeyError: <property object at 0x7fe62ae222a0>
I'll keep digging to see why. Reading more about `sample_table_index`, it's supposed to default to `sample_name`, so maybe something in the code is passing in a different value that isn't recognized as a key.
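One thing worth noting about the traceback: a `KeyError` whose key prints as `<property object at ...>` is what Python produces when a property is read from the class itself rather than from an instance. Whether that is what happened in looper is not confirmed in this thread. The sketch below only reproduces the symptom, using a hypothetical stand-in `Project` class and a plain dict in place of a real sample object.

```python
class Project:
    """Hypothetical stand-in, not peppy's real Project class."""

    def __init__(self, index="sample_name"):
        self._index = index

    @property
    def sample_table_index(self):
        return self._index


sample = {"sample_name": "frog_1"}
project = Project()

# Instance access evaluates the property and yields the string "sample_name".
instance_key = project.sample_table_index
value = sample[instance_key]

# Class access yields the property object itself, which is not a key in sample.
class_key = Project.sample_table_index
try:
    sample[class_key]
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    message = repr(err)
```

If this is the cause, the fix would be to read the index from the project instance available at the call site, not from the `Project` class.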
I was able to figure it out. I will PR and commit for review.
@stolarczyk The changelog for 0.32.0 says:
This isn't really documented anywhere, but I think these are intended to be parallel to `sample_table` and `subsample_table` in the PEP project_config specification, and define the column name to be used for the identifier attribute. Am I right?
This seems to be sort of working, but with looper I get this error:
I guess this is likely a looper bug, and not a peppy bug, right? It seems that looper just isn't adapted to use this new model yet?
Also, this will really require an update to the specification. For now we could mark it as experimental.