Closed japhir closed 3 years ago
KeyError: "'D47_AU002_(2)' is not a valid Parameters name"
That is an error from lmfit
, because lmfit
model parameter names cannot use a number of special characters (because it clashes with the expression of mathematical constraints such as param_a = 1/(param_b+1)
). From here:
the name must match
[a-z_][a-z0-9_]*
and cannot be a Python reserved word.
D47crunch
uses the (crude) function pf()
to solve common sample name problems.
pf()
to handle parentheses. Hacky but easy.Any preference between these two options?
Thanks for the quick reply! Ah, I guess an informative error would suffice here, really. I think it makes total sense not to allow parentheses or other weird characters in the UIDs. I'll just get that stuff out in R and then try again.
Hmm after trying to resolve this, I guess forcing the sample id's to not start with a number might be limiting. I got this to work by replacing all (
, )
, [
, ]
, and +
with _
, but there are also many samples that have ^[a-z0-9_]*
.
Furthermore, it seems like all ETH-1, ETH-2's etc. with their -
are also a problem.
EDIT: hmm that cannot be the problem, as it's now explicitly asking for an ETH-1, ETH-2, and ETH-3.
Could you test whether ed04198dd73368668aa4a1faa389cc085753b55a works for you? It uses a 6-digit hash followed by a sanitized version of the original sample name. I expect it to work well with any arbitrary sample name.
This seems to work! The standardize
step is pretty slow (several minutes for a 552 samples and 1959 anchors) but I have no idea how fast it should be, and computing session ETFs with fancy maths is also very slow in my R version.
It didn't throw an immediate error about the wrong Sample names this time and seems to have worked out okay!
after having solved the first output errors in #5, I immediately ran into another issue when calling
mydata.standardize()
:Note that the lines referring to /tmp/... etc. are there because I'm running python from an orgmode file in emacs.
I'm not sure how to debug this one, so I'll leave it as an issue here! I'm sure it must be something in my dataset (cannot share unfortunately, latest measurements) because I didn't get this error when running the code on your example tiny dataset. My dataset is structured as follows:
I've made sure there are no NAs in there, so I'm a bit confused.
I'm running python 3.9.6 with D47crunch 2.0.2, which in my case relies on lmfit 1.0.2, matplotlib 3.4.3, scipy 1.7.1 and numpy 1.21.2