Goodman-lab / DP5

Python workflow for DP5 and DP4 analysis of organic molecules

DP4 analysis using NMR descriptor file fails #79

Open PeteGierth opened 1 year ago

PeteGierth commented 1 year ago

When supplying an NMR descriptor file rather than raw NMR data, the proton shifts are not imported properly: one of the arrays used is filled entirely with the first shift value. An error is returned, e.g.:

    Hshifts: [6.89, 6.73, 3.81, 3.81, 3.81, 3.6, 3.6, 3.6, 2.3, 2.3, 2.3, 2.0, 2.0, 2.0]
    Equivalents: []
    Omits: []

    Calculating DP4 probabilities...
    Traceback (most recent call last):
      File "/home/nmr/DP5-master/PyDP4.py", line 843, in <module>
        main(settings)
      File "/home/nmr/DP5-master/PyDP4.py", line 529, in main
        DP4data = DP4.InternalScaling(DP4data)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/nmr/DP5-master/DP4.py", line 139, in InternalScaling
        DP4data.Hscaled.append(ScaleNMR(Hshifts, Hexp))
                               ^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/nmr/DP5-master/DP4.py", line 151, in ScaleNMR
        slope, intercept, r_value, p_value, std_err = stats.linregress(expShifts,
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/nmr/anaconda3/envs/dp5/lib/python3.11/site-packages/scipy/stats/_stats_mstats_common.py", line 157, in linregress
        raise ValueError("Cannot calculate a linear regression "
    ValueError: Cannot calculate a linear regression if all x values are identical

The printed Hshifts list looks correct. However, inspecting the variable that is passed to the regression shows that it contains only 6.89, 6.89, 6.89, ...
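To illustrate why an array of identical values triggers this error, here is a minimal sketch of an ordinary least-squares fit in plain Python. It is not SciPy's implementation; `linear_fit` is a hypothetical helper, and the calculated shifts used as y values are made up for the example. With identical x values the slope is undefined (its denominator, the spread of x, is zero), so the function refuses to fit.

```python
def linear_fit(x, y):
    """Textbook least-squares line through (x, y); hypothetical helper, not SciPy code."""
    if len(set(x)) < 2:
        # No spread in x means the slope's denominator would be zero.
        raise ValueError("all x values are identical; slope is undefined")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ssxx = sum((xi - mean_x) ** 2 for xi in x)
    ssxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = ssxy / ssxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Experimental shifts as they arrive after the faulty assignment: all 6.89.
# The y values are illustrative calculated shifts, not real data.
try:
    linear_fit([6.89] * 4, [7.0, 6.6, 3.9, 2.1])
except ValueError as err:
    message = str(err)
```

So the regression error is only a symptom; the real problem is upstream, where the experimental shifts are assigned.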

In NMR.py we have:

    exp_ind = 0

    for shift, label in zip(sortedCCalc, sortedClabels):
        if label not in NMRData.Omits:
            ind = tempCCalcs.index(shift)
            assignedCExp[ind] = sortedCExp[exp_ind]
            tempCCalcs[ind] = ''
            exp_ind += 1

    # Proton

    exp_ind = 0

    for shift, label in zip(sortedHCalc, sortedHlabels):
        if label not in NMRData.Omits:
            ind = tempHCalcs.index(shift)
            assignedHExp[ind] = sortedHExp[exp_ind]
            tempHCalcs[ind] = ''

    # update isomers class

    iso.Cexp = assignedCExp
    iso.Hexp = assignedHExp

That is, unlike the carbon loop, the proton loop never increments exp_ind, so every proton is assigned sortedHExp[0]. Inserting exp_ind += 1 after tempHCalcs[ind] = '' solves the problem.
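For anyone who wants to see the effect in isolation, here is a small stand-alone sketch of the same assignment pattern. It is not the DP5 source: `assign_exp_shifts`, its `increment` flag, the descending sort order and the sample shifts are all assumptions made for the illustration.

```python
def assign_exp_shifts(calc, labels, exp, omits=(), increment=True):
    """Pair calculated and experimental shifts by rank (hypothetical helper)."""
    # Sort calculated shifts with their labels, highest first, and sort the
    # experimental shifts the same way (the sort order is an assumption).
    pairs = sorted(zip(calc, labels), reverse=True)
    sorted_exp = sorted(exp, reverse=True)

    temp_calc = list(calc)
    assigned = [None] * len(calc)
    exp_ind = 0
    for shift, label in pairs:
        if label not in omits:
            ind = temp_calc.index(shift)
            assigned[ind] = sorted_exp[exp_ind]
            temp_calc[ind] = None  # mark as used so duplicate shifts find the next slot
            if increment:
                exp_ind += 1  # the step missing from the proton loop
    return assigned


calc = [2.1, 7.0, 3.9, 6.6]
labels = ["H4", "H1", "H3", "H2"]
exp = [6.89, 6.73, 3.81, 2.3]

buggy = assign_exp_shifts(calc, labels, exp, increment=False)
fixed = assign_exp_shifts(calc, labels, exp)
```

Without the increment, `buggy` holds 6.89 in every position; with it, `fixed` pairs each calculated shift with the experimental shift of the same rank.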

HeheHahaHoHoHo commented 1 month ago

Hi, this reply is coming in super late. You are right: my proton assignment was also broken, with only one shift being assigned, as you mentioned. I added the exp_ind += 1 and the DP4 calculations now run fine. The only issue I would mention is that using equivalent H did not work. I'm not sure whether it is supported, because the list lengths can differ even though the index is shared. Thanks a lot to the OP for the assistance.

Cheers