ISUgenomics / SequelTools

GNU General Public License v3.0

error running SequelTools: generateReadLenStats_noScraps.py: command not found #12

Closed alanmejiamaza closed 3 years ago

alanmejiamaza commented 3 years ago

Hi,

I am trying to process some demultiplexed ccs.bam files. I am running SequelTools on a cluster, in a conda environment with Python >3.7. Samtools and the R-essential packages are installed.

I executed (no scraps mode):

./SequelTools.sh -t Q -u subtext.txt

Beginning quality control function
Running in NO_SCRAPS mode
./SequelTools.sh: line 676: generateReadLenStats_noScraps.py: command not found
ERROR: Calculation of read length statistics failed!

I have tried different BAM files, both non-demultiplexed raw BAM and ccs.bam files. None of them worked.

I had a similar issue when running on my Mac (Big Sur).

Any help would be highly appreciated. Happy to share files to see where the problem is. Thanks

alanmejiamaza commented 3 years ago

I am in the user/Scripts directory, and the dependencies are listed there: generateReadLenStats_noScraps.py, filterReads.py, plotForSequelQC_noScraps.R, subsampleReads_noScraps.py, ......

aseetharam commented 3 years ago

Hi @alanmejiamaza

Thanks for trying SequelTools. A "command not found" error usually means that your system does not know where to look for the command you issued. One easy way to solve this is to put the scripts in the $PATH. Calling the main script by its full path is not enough: it will still fail, because the main script calls other scripts that it assumes are in the $PATH. You can either put all the scripts in the directory where you are running them or add the scripts folder to the path as follows:

export PATH=$PATH:/location/of/your/scripts/dir
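To confirm that the change took effect, something like the following sketch could be run in the same shell session before calling SequelTools.sh. The function name check_on_path and the SCRIPTS_DIR variable are made up for illustration and are not part of SequelTools; the script names are the ones listed earlier in this thread, and the directory is the same placeholder used above.

```shell
# Placeholder location of the SequelTools Scripts directory (assumption; adjust to your setup).
SCRIPTS_DIR="${SCRIPTS_DIR:-/location/of/your/scripts/dir}"

# Hypothetical helper: report which of the given commands the shell can resolve via $PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_on_path() {
    missing=0
    for script in "$@"; do
        if command -v "$script" >/dev/null 2>&1; then
            echo "found:   $script"
        else
            echo "missing: $script"
            missing=1
        fi
    done
    return "$missing"
}

# Append the Scripts directory to PATH, then check the helper scripts named in this thread.
export PATH="$PATH:$SCRIPTS_DIR"
check_on_path generateReadLenStats_noScraps.py filterReads.py \
    plotForSequelQC_noScraps.R subsampleReads_noScraps.py \
    || echo "Some helper scripts are still not on PATH; check SCRIPTS_DIR."
```

Note that an export typed at the prompt only lasts for the current session. On a cluster, a job submitted to a scheduler may start in a fresh shell, so the export line may need to go in the job script itself.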

Hope this helps!

Best regards,

alanmejiamaza commented 3 years ago

Thanks. It actually works on my Mac but not on the cluster, even though the dependencies are installed; I am not sure why. It is also not clear to me why output is created for only 3 of the 4 BAM files. The SequelToolsResults folder contains only 6 output files, while the demo produced 11 PDF files. I guess one of my ccs.bam files is corrupted.

None of the plots have names assigned, and the summary table contains "NA" instead of file names. The BAM file names are simple: bc1001.bam... bc1004.bam

I do get this error message:

Beginning quality control function
Running in NO_SCRAPS mode
rm: SequelToolsResults/.readLens.sub.txt: No such file or directory
rm: SequelToolsResults/.readLens.longSub.txt: No such file or directory
rm: SequelToolsResults/.SMRTcellStats_noScraps.txt: No such file or directory
rm: SequelToolsResults/.readLens.sub.txt: No such file or directory
rm: SequelToolsResults/.readLens.longSub.txt: No such file or directory
rm: SequelToolsResults/.SMRTcellStats_noScraps.txt: No such file or directory
SequelTools has finished!

Thanks in advance

aseetharam commented 3 years ago

The input for SequelTools is either subreads, scraps, or both subreads and scraps. Unfortunately, SequelTools does not know how to QC CCS reads (processed subreads) at this time. The metadata required for generating QC stats is present only in the subreads/scraps files.
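One way to tell the file types apart before running the QC is to look at the BAM header. The sketch below assumes the files follow the PacBio BAM convention of recording a READTYPE value (for example SUBREAD, SCRAP, or CCS) in the DS field of the @RG header lines. The function name read_types is made up for illustration and is not part of SequelTools or samtools.

```shell
# Hypothetical helper: print the distinct READTYPE values found in the
# @RG lines of a SAM/BAM header read from standard input.
# Assumes the PacBio convention of storing READTYPE=... in the @RG DS field.
read_types() {
    grep '^@RG' | grep -o 'READTYPE=[A-Za-z]*' | sort -u
}

# Usage (requires samtools; bc1001.bam is a file name mentioned in this thread):
#   samtools view -H bc1001.bam | read_types
```

If the output shows READTYPE=CCS rather than READTYPE=SUBREAD or READTYPE=SCRAP, the file is processed CCS data and not the kind of input described above.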

alanmejiamaza commented 3 years ago

Got it. Thanks for the inputs.