Weeks-UNC / shapemapper2

Public repository for ShapeMapper 2 releases

'RuntimeError: Error: input data lengths do not all match.' in the CalcProfile step #35

Closed eumhh closed 10 months ago

eumhh commented 1 year ago

Hi, I'm trying to run ShapeMapper on Ubuntu 20.04. Sometimes, but not every time, it raises a RuntimeError in the 'CalcProfile' step. Here is the command I used and the error message:

shapemapper-2.1.3/shapemapper --name sample --target /home/user/reference/myfasta.fasta --out sample --modified --R1 sample2-2_1.fq.gz --R2 sample2-2_2.fq.gz --output-aligned --random-primer-len 9


Running CalcProfile at 2022-08-23 11:23:50 . . .

ERROR: Component "CalcProfile" (RNA: sample) failed, giving the following error message:
=======
Traceback (most recent call last):
  File "/home/user/shapemapper-2.1.3/python/pyshapemap/../../bin/make_reactivity_profiles.py", line 396, in
    raise RuntimeError(s)
RuntimeError: Error: input data lengths do not all match.

In what cases is this error raised?

Thanks, Eum

Psirving commented 1 year ago

I'm not sure what is causing your error, but it looks like you are using an old version (2.1.3). First, update to the newest version and determine if this issue is still occurring.

eumhh commented 1 year ago

The latest version (2.1.5) still produces the same error ;(


eumhh commented 1 year ago

I found the cause and fixed it. There was a space in the fasta header ;( After removing the space, the ShapeMapper execution completed successfully.

Psirving commented 1 year ago

So glad you found the issue! It is frustrating when things fail due to a tiny typo. Could you provide more info about what happened? I might be able to make this error more informative for future users.

eumhh commented 1 year ago

A FASTA file whose header line has a space right after the '>' triggered the RuntimeError.

> Test     [there is a space at the beginning of the header line]
GATATCGAATTCGGGCAACCTAATACGACTCACTATAGGGACATTTGCTTCTGACACAACT


ERROR: Component "CalcProfile" (RNA: sample) failed, giving the following error message:
=======
Traceback (most recent call last):
  File "/home/user/shapemapper-2.1.3/python/pyshapemap/../../bin/make_reactivity_profiles.py", line 396, in
    raise RuntimeError(s)
RuntimeError: Error: input data lengths do not all match.


After checking lines 384-396 of 'make_reactivity_profiles.py', I added a line to print the lengths of the inputs.


# check that seq length matches all mutation count data length and depth length
lengths = []
lengths.append(len(seq))
for k in samples:
    if counts[k] is not None:
        lengths.append(counts[k].shape[0])
    if read_depths[k] is not None:
        lengths.append(read_depths[k].shape[0])
    if effective_depths[k] is not None:
        lengths.append(effective_depths[k].shape[0])
print(lengths)    ####### added line to figure out the lengths of the inputs
if len(set(lengths)) > 1:
    s = "Error: input data lengths do not all match."
    raise RuntimeError(s)

The printed output showed that the length of the FASTA sequence was 0. We did not suspect the FASTA file at first because the alignment steps ran without any problems, but removing the space in the header line fixed the error.
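In case it helps with making the error more informative, here is a minimal sketch of a length check that reports which input has which length. It is not the actual ShapeMapper code: the function name `check_lengths` and the `arrays_by_name` argument are hypothetical, and it uses `len()` instead of `.shape[0]` so that it also works with plain lists.

```python
def check_lengths(seq, arrays_by_name):
    """Raise RuntimeError listing every length if they do not all match.

    seq: the reference sequence as a string.
    arrays_by_name: dict mapping a label (hypothetical naming) to a sized
        object such as a list or NumPy array, or to None if absent.
    Returns the dict of labeled lengths when everything matches.
    """
    lengths = {"sequence": len(seq)}
    for name, arr in arrays_by_name.items():
        if arr is not None:
            lengths[name] = len(arr)

    if len(set(lengths.values())) > 1:
        details = ", ".join(f"{k}={v}" for k, v in lengths.items())
        hint = ""
        if lengths["sequence"] == 0:
            # An empty sequence is what the leading-space header produced here.
            hint = " The sequence is empty; check the FASTA header and formatting."
        raise RuntimeError(
            "Error: input data lengths do not all match (" + details + ")." + hint
        )
    return lengths
```

With a message like this, a sequence length of 0 would have pointed at the FASTA file right away.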

>Test     [no space at the beginning of the header line]
GATATCGAATTCGGGCAACCTAATACGACTCACTATAGGGACATTTGCTTCTGACACAACT

I recommend double-checking the input format before running ShapeMapper.
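A rough sketch of such a pre-check is below. It only looks for the problem described in this issue (whitespace immediately after '>', or a header with no name at all) and is not part of ShapeMapper; the function name `find_bad_fasta_headers` is made up for illustration.

```python
def find_bad_fasta_headers(lines):
    """Return (line_number, line) pairs for suspicious FASTA header lines.

    A header is flagged when '>' is followed directly by whitespace
    or by nothing at all. Line numbers are 1-based.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith(">") and (len(line) == 1 or line[1].isspace()):
            bad.append((number, line))
    return bad


# Usage with in-memory lines; a real file could be passed as open(path).
print(find_bad_fasta_headers([">Test", "GATATCGAATTC"]))   # []
print(find_bad_fasta_headers(["> Test", "GATATCGAATTC"]))  # [(1, '> Test')]
```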

Thanks, Eum