Closed chuckzzzz closed 7 months ago
Hi,
I had this same situation. I downloaded SRR21492154 data and the result is here:
Total 98363542 reads are processed.
Time elapse: 22 : 22 : 30.80
Detecting rate: 12.89%
Result counting:
Number of 3'-adaptor located on the read head region: 2831310
Number of 3'-adaptor + polyT on the read head region: 439768
Number of 3'-adaptor located on the read tail region: 9843346
Number of 3'-adaptor + polyT on the read tail region: 1827523
Alignment counting:
Number of 3'-adaptor having no mismatch: 938
Number of 3'-adaptor having mismatch at the last one position: 592898
Number of 3'-adaptor having mismatch at all the last two position: 201109
Number of 3'-adaptor having mismatch at all the last three position: 96201
Number of 3'-adaptor having in/del at the last one position: 22676
Number of 3'-adaptor having in/del at the last two position: 5291525
Number of 3'-adaptor having in/del at the last three position: 6741530
Number of rescued truncated 3'-adaptor on the read head region: 6423
Number of rescued truncated 3'-adaptor on the read tail region: 309789
Finish time stamp: Sat, 16 Mar 2024 09:45:26
I guess the scanner looks at the first and last 100 nucleotides of each read to recognize TruSeq Read 1 and PolyA, but maybe I don't understand it very well. I would be very thankful for further explanation.
Thanks a lot.
Hi,
Thank you for reporting the issue. However, I couldn't reproduce your low detection rate. Could you give me more details about your environment?
The virtual environment I'm using:
(test_env) production ~/data/test/test_test $ python3 --version
Python 3.9.19
(test_env) ~/data/test/test_test $ pip3 install -r scNanoGPS/requirements.txt
Requirement already satisfied: biopython in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 1)) (1.83)
Requirement already satisfied: distance in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 2)) (0.1.3)
Requirement already satisfied: liqa in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 3)) (1.3.4)
Requirement already satisfied: matplotlib in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 4)) (3.8.3)
Requirement already satisfied: pandas in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 5)) (2.2.1)
Requirement already satisfied: pysam in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 6)) (0.22.0)
Requirement already satisfied: seaborn in /anaconda3/envs/test_env/lib/python3.9/site-packages (from -r scNanoGPS/requirements.txt (line 7)) (0.13.2)
Requirement already satisfied: numpy in /anaconda3/envs/test_env/lib/python3.9/site-packages (from biopython->-r scNanoGPS/requirements.txt (line 1)) (1.26.4)
Requirement already satisfied: lifelines in /anaconda3/envs/test_env/lib/python3.9/site-packages (from liqa->-r scNanoGPS/requirements.txt (line 3)) (0.28.0)
Requirement already satisfied: contourpy>=1.0.1 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (1.2.0)
Requirement already satisfied: cycler>=0.10 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (4.50.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (1.4.5)
Requirement already satisfied: packaging>=20.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (24.0)
Requirement already satisfied: pillow>=8 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (10.2.0)
Requirement already satisfied: pyparsing>=2.3.1 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (3.1.2)
Requirement already satisfied: python-dateutil>=2.7 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (2.9.0.post0)
Requirement already satisfied: importlib-resources>=3.2.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from matplotlib->-r scNanoGPS/requirements.txt (line 4)) (6.4.0)
Requirement already satisfied: pytz>=2020.1 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from pandas->-r scNanoGPS/requirements.txt (line 5)) (2024.1)
Requirement already satisfied: tzdata>=2022.7 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from pandas->-r scNanoGPS/requirements.txt (line 5)) (2024.1)
Requirement already satisfied: zipp>=3.1.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from importlib-resources>=3.2.0->matplotlib->-r scNanoGPS/requirements.txt (line 4)) (3.18.1)
Requirement already satisfied: six>=1.5 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from python-dateutil>=2.7->matplotlib->-r scNanoGPS/requirements.txt (line 4)) (1.16.0)
Requirement already satisfied: scipy>=1.2.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (1.12.0)
Requirement already satisfied: autograd>=1.5 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (1.6.2)
Requirement already satisfied: autograd-gamma>=0.3 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (0.5.0)
Requirement already satisfied: formulaic>=0.2.2 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (1.0.1)
Requirement already satisfied: future>=0.15.2 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from autograd>=1.5->lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: interface-meta>=1.2.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from formulaic>=0.2.2->lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (1.3.0)
Requirement already satisfied: typing-extensions>=4.2.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from formulaic>=0.2.2->lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (4.10.0)
Requirement already satisfied: wrapt>=1.0 in /anaconda3/envs/test_env/lib/python3.9/site-packages (from formulaic>=0.2.2->lifelines->liqa->-r scNanoGPS/requirements.txt (line 3)) (1.16.0)
The scanner result I got with SRR21492154:
(test_env) ~/data/test/test_test $ cat scNanoGPS_res.3.9/scanner.log.txt
Starting time stamp: Fri, 22 Mar 2024 18:32:44
List of parameters:
Current working directory: ~/data/test/test_test
Input file name: source/SRR21492154.fastq.gz
Output FastQ file name: scNanoGPS_res.3.9/processed.fastq.gz
Output barcode list name: scNanoGPS_res.3.9/barcode_list.tsv.gz
Log file name: scNanoGPS_res.3.9/scanner.log.txt
Parameters for pattern search:
Length of barcode: 16
Length of UMI: 12
5'-adaptor sequence: AAGCAGTGGTATCAACGCAGAGTACAT
3'-adaptor sequence: CTACACGACGCTCTTCCGATCT
PolyT sequence: TTTTTTTTTTTT
Scanning region length: 100
Penalty for dynamic programming:
Matching: 2
Mismatching: -3
Gap opening: -5
Gap extension: -2
Editing distance: 2
Parameters for computing:
Number of computer cores: 10
Number of reads per batch job: 1000
Minimal length of read: 200
Matching threshold: 0.7
Scoring threshold: 0.4
Debug mode switch: False
Total 98363542 reads are processed.
Time elapse: 19 : 2 : 43.55
Detecting rate: 78.47%
Result counting:
Number of 3'-adaptor located on the read head region: 37205573
Number of 3'-adaptor + polyT on the read head region: 36773661
Number of 3'-adaptor located on the read tail region: 39983822
Number of 3'-adaptor + polyT on the read tail region: 39217213
Alignment counting:
Number of 3'-adaptor having no mismatch: 31524450
Number of 3'-adaptor having mismatch at the last one position: 4132090
Number of 3'-adaptor having mismatch at all the last two position: 2398869
Number of 3'-adaptor having mismatch at all the last three position: 1009858
Number of 3'-adaptor having in/del at the last one position: 1144
Number of 3'-adaptor having in/del at the last two position: 1049
Number of 3'-adaptor having in/del at the last three position: 955
Number of rescued truncated 3'-adaptor on the read head region: 120924
Number of rescued truncated 3'-adaptor on the read tail region: 10598727
Finish time stamp: Sat, 23 Mar 2024 13:35:28
I also set up a Python 3.11 environment and ran the scanner on SRR21492154. The scanner log:
(test_env_3.11) ~/data/test/test_test $ cat scNanoGPS_res.3.11/scanner.log.txt
Starting time stamp: Fri, 22 Mar 2024 18:33:29
List of parameters:
Current working directory: ~/data/test/test_test
Input file name: source/SRR21492154.fastq.gz
Output FastQ file name: scNanoGPS_res.3.11/processed.fastq.gz
Output barcode list name: scNanoGPS_res.3.11/barcode_list.tsv.gz
Log file name: scNanoGPS_res.3.11/scanner.log.txt
Parameters for pattern search:
Length of barcode: 16
Length of UMI: 12
5'-adaptor sequence: AAGCAGTGGTATCAACGCAGAGTACAT
3'-adaptor sequence: CTACACGACGCTCTTCCGATCT
PolyT sequence: TTTTTTTTTTTT
Scanning region length: 100
Penalty for dynamic programming:
Matching: 2
Mismatching: -3
Gap opening: -5
Gap extension: -2
Editing distance: 2
Parameters for computing:
Number of computer cores: 10
Number of reads per batch job: 1000
Minimal length of read: 200
Matching threshold: 0.7
Scoring threshold: 0.4
Debug mode switch: False
Total 98363542 reads are processed.
Time elapse: 18 : 17 : 58.16
Detecting rate: 78.47%
Result counting:
Number of 3'-adaptor located on the read head region: 37205573
Number of 3'-adaptor + polyT on the read head region: 36773661
Number of 3'-adaptor located on the read tail region: 39983822
Number of 3'-adaptor + polyT on the read tail region: 39217213
Alignment counting:
Number of 3'-adaptor having no mismatch: 31524450
Number of 3'-adaptor having mismatch at the last one position: 4132090
Number of 3'-adaptor having mismatch at all the last two position: 2398869
Number of 3'-adaptor having mismatch at all the last three position: 1009858
Number of 3'-adaptor having in/del at the last one position: 1144
Number of 3'-adaptor having in/del at the last two position: 1049
Number of 3'-adaptor having in/del at the last three position: 955
Number of rescued truncated 3'-adaptor on the read head region: 120924
Number of rescued truncated 3'-adaptor on the read tail region: 10598727
Finish time stamp: Sat, 23 Mar 2024 12:51:28
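For anyone who wants to compare two scanner.log.txt files without reading them side by side, here is a minimal sketch. It assumes only the "key: value" line layout visible in the logs above; the helper names `parse_log` and `differing_fields` are made up for this illustration and are not part of scNanoGPS. The log excerpts in the demo are copied from this thread.

```python
def parse_log(text):
    """Collect 'key: value' lines from scanner.log.txt text into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # lines without ': ' (e.g. 'Total ... reads are processed.') are skipped
            fields[key] = value.strip()
    return fields


def differing_fields(log_a, log_b,
                     ignore=("Starting time stamp", "Finish time stamp", "Time elapse")):
    """Return {key: (value_a, value_b)} for fields that differ, skipping timing fields."""
    a, b = parse_log(log_a), parse_log(log_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if key not in ignore and a.get(key) != b.get(key)
    }


# Excerpts copied from the logs posted in this thread.
py39_log = """Time elapse: 19 : 2 : 43.55
Detecting rate: 78.47%
Number of 3'-adaptor having no mismatch: 31524450
"""
py311_log = """Time elapse: 18 : 17 : 58.16
Detecting rate: 78.47%
Number of 3'-adaptor having no mismatch: 31524450
"""
low_rate_log = """Time elapse: 22 : 22 : 30.80
Detecting rate: 12.89%
Number of 3'-adaptor having no mismatch: 938
"""

print(differing_fields(py39_log, py311_log))    # the two working runs
print(differing_fields(py39_log, low_rate_log)) # working run vs. the low-rate run
```

Output file names also differ between runs, so on full logs you may want to add those keys to `ignore`.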
Please share more details so I can identify how the issue occurred. Thank you.
Regards, Cheng-Kai
The answers for your questions:
"Specifically, what does the detection rate mean? Is it only 13.28% reads have valid adaptors and the reads that don't are discarded?"
The detection rate is the proportion of reads in which TruSeq R1 and TSO were identified, as described in our paper.
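As a sanity check on the numbers, in every log posted in this thread the reported "Detecting rate" equals the two "3'-adaptor located on the read head/tail region" counts added together and divided by the total read count. That is an observation from these logs, not something confirmed from the scNanoGPS source, and the function name `detection_rate` below is made up for this sketch.

```python
import re


def detection_rate(log_text: str) -> float:
    """Recompute the detecting rate (percent) from scanner.log.txt text.

    Assumption: rate = (head-region + tail-region 3'-adaptor hits) / total reads,
    which matches the logs in this thread.
    """
    def grab(pattern: str) -> int:
        match = re.search(pattern, log_text)
        if match is None:
            raise ValueError(f"pattern not found: {pattern}")
        return int(match.group(1))

    total = grab(r"Total (\d+) reads are processed")
    head = grab(r"3'-adaptor located on the read head region: (\d+)")
    tail = grab(r"3'-adaptor located on the read tail region: (\d+)")
    return round(100.0 * (head + tail) / total, 2)


# Counts copied from the example-data log in this thread.
example_log = """
Total 7731 reads are processed.
Number of 3'-adaptor located on the read head region: 3854
Number of 3'-adaptor + polyT on the read head region: 3854
Number of 3'-adaptor located on the read tail region: 3877
Number of 3'-adaptor + polyT on the read tail region: 3877
"""
print(detection_rate(example_log))
```

The same arithmetic gives 78.47 for the SRR21492154 runs under Python 3.9 and 3.11 and 12.89 for the low-rate run.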
"I guess it is the first and last 100 nucleotides of reads to recognize TruSeq Read 1 and PolyA. maybe I don't understand very well."
The scanner scans the first 100 nucleotides and the last 100 nucleotides of each read for TruSeq R1/PolyA and TSO. In the scanner, I use the term 3'-adaptor for TruSeq R1 and 5'-adaptor for TSO simply because TruSeq R1 is ligated to the 3' end of the mRNA while TSO is ligated to the 5' end. It is rare for the sequencer to produce unknown sequence outside the adaptor pair. You can change the scanning region with the "--scanning_region" parameter.
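To make the head/tail idea concrete, here is a rough sketch of looking for an adaptor in the first and last 100 nucleotides of a read while tolerating a small edit distance. This is not the scanner's actual implementation: the approximate-matching routine, the use of the reverse complement on the tail, and the names `min_edit_distance` and `locate_adaptor` are all assumptions made for illustration. Only the adaptor sequence, the region length of 100 and the edit distance of 2 are taken from the parameter list in the logs above.

```python
def min_edit_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between pattern and any substring of text."""
    prev = [0] * (len(text) + 1)  # empty pattern: a match may start anywhere in text
    for i, p in enumerate(pattern, start=1):
        curr = [i] + [0] * len(text)
        for j, t in enumerate(text, start=1):
            curr[j] = min(
                prev[j - 1] + (p != t),  # match or substitution
                prev[j] + 1,             # pattern base missing from text
                curr[j - 1] + 1,         # extra base in text
            )
        prev = curr
    return min(prev)  # a match may end anywhere in text


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate_adaptor(read: str, adaptor: str, region: int = 100, max_dist: int = 2):
    """Return 'head', 'tail' or None depending on where the adaptor is found."""
    head, tail = read[:region], read[-region:]
    if min_edit_distance(adaptor, head) <= max_dist:
        return "head"
    # Assumption: a read sequenced from the other strand carries the adaptor
    # as its reverse complement near the tail.
    if min_edit_distance(reverse_complement(adaptor), tail) <= max_dist:
        return "tail"
    return None


ADAPTOR_3P = "CTACACGACGCTCTTCCGATCT"  # 3'-adaptor sequence from the log

forward_read = "GG" + ADAPTOR_3P + "A" * 300
reverse_read = "A" * 300 + reverse_complement(ADAPTOR_3P) + "CC"
print(locate_adaptor(forward_read, ADAPTOR_3P))
print(locate_adaptor(reverse_read, ADAPTOR_3P))
```

A larger `region` corresponds loosely to raising "--scanning_region": it lets adaptors that sit further from the read ends be found, at the cost of more computation per read.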
Hope this helps.
Regards, Cheng-Kai
Hi Cheng-Kai,
Thanks very much for your helpful reply!
My Python environment is Python 3.7.12, and I installed the required libraries and tools within this virtual environment. I then ran scanner.py, which produced the low adaptor detection rate shown above. When I ran the example fastq data, a similarly low detection rate occurred again:
Total 7731 reads are processed.
Time elapse: 0 : 0 : 8.77
Detecting rate: 11.77%
Here I just used the example code. Could this be caused by the Python environment, or by libraries/tools that were installed incorrectly? The output '*minimap2.bam' files are empty if I use the 'processed.fastq.gz' from this low-detection-rate run, which confused me a lot.
I would appreciate any advice on how to deal with this. Thanks a lot.
Regards, Lily
Hi Lily,
I developed the pipeline on Python 3.9 and have never tried older Python versions. As far as I know, Python 3.7 may pull in older versions of biopython and pysam as well, and I cannot guarantee that my code works properly with them. Please create a new conda environment with:
conda create -n <new_env_name> python=3.9 numpy scipy
Please update anaconda/miniconda if necessary.
An 11.77% detecting rate means that only 11.77% of reads were detected to have TruSeqR1+CellBarcode+UMI. That is far too low and doesn't make sense.
According to your trial of SRR21492154, I believe there is something wrong with the conda environment. Please set up a new environment and make sure you can obtain a detection rate above 70% from SRR21492154. Then please run your data again. Hope this helps.
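Before rerunning the scanner, it can help to confirm what the active environment really contains. The following is a small sketch, not part of scNanoGPS: it checks the interpreter version against 3.9 (the version the pipeline was developed on, per this thread) and prints the installed versions of the packages listed in requirements.txt as shown in the pip output above. The function names `python_ok` and `package_versions` are made up for this example.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8

# Package names as they appear in the requirements.txt output in this thread.
REQUIRED = ["biopython", "distance", "liqa", "matplotlib", "pandas", "pysam", "seaborn"]


def python_ok(version=sys.version_info, minimum=(3, 9)):
    """True if the interpreter is at least the version the pipeline was developed on."""
    return tuple(version[:2]) >= minimum


def package_versions(names=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python", sys.version.split()[0], "OK" if python_ok() else "older than 3.9")
for name, version in package_versions().items():
    print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Since `importlib.metadata` only exists from Python 3.8 onward, an ImportError on the second line is itself a sign that the script is running under an old interpreter such as 3.7.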
Regards, Cheng-Kai
"May I ask is it because the python environment or libraries/tools installed wrongly?"
According to your trial of SRR21492154, I believe there is something wrong with the conda environment.
"The output '*minimap2.bam' are empty if I use this low detection rate 'processed.fastq.gz', which confused me a lot."
The processed.fastq.gz stores the reads after removal of TruSeq R1, cell barcode, UMI, and TSO. Empty *.minimap2.bam files point to a library/tool/environment problem: the 2nd step (assigner) is designed to filter out ambients, so it should never produce empty BAM files.
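One quick way to see whether the problem starts before alignment is to count how many reads actually ended up in processed.fastq.gz. The sketch below assumes a standard four-line-per-record FASTQ file; `count_fastq_reads` is a hypothetical helper written for this thread, and the demo writes a tiny synthetic file rather than touching real data.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Demo on a tiny synthetic file; point demo_path at your own processed.fastq.gz instead.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "processed.fastq.gz")
with gzip.open(demo_path, "wt") as out:
    out.write("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")

print(count_fastq_reads(demo_path))
```

If the count is far below the number of reads the scanner log reports as detected, the issue lies in the scanner step or the environment rather than in minimap2.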
Hi,
Many thanks for your quick reply!
I created a new conda environment; the Python version is as follows:
$ python3 --version
Python 3.9.19
The dependencies and required libraries/tools have been installed in this Python 3.9.19 environment.
The command I used to process the example data is: 'python3 scanner.py -i example/fastq/ -t 2'
The 'scanner.log.txt' file now shows a 100% detection rate:
Total 7731 reads are processed.
Time elapse: 0 : 0 : 15.08
Detecting rate: 100.00%
Result counting:
Number of 3'-adaptor located on the read head region: 3854
Number of 3'-adaptor + polyT on the read head region: 3854
Number of 3'-adaptor located on the read tail region: 3877
Number of 3'-adaptor + polyT on the read tail region: 3877
Alignment counting:
Number of 3'-adaptor having no mismatch: 4677
Number of 3'-adaptor having mismatch at the last one position: 35
Number of 3'-adaptor having mismatch at all the last two position: 24
Number of 3'-adaptor having mismatch at all the last three position: 3
Number of 3'-adaptor having in/del at the last one position: 0
Number of 3'-adaptor having in/del at the last two position: 0
Number of 3'-adaptor having in/del at the last three position: 0
Number of rescued truncated 3'-adaptor on the read head region: 3
Number of rescued truncated 3'-adaptor on the read tail region: 879
I think the cause was the old Python version used for the scanner step, which resulted in a very low detection rate for the R1 and TSO adaptors. I will now run the SRR21492154 data for further analysis and will let you know whether it runs smoothly.
Thank you very much.
Regards, Lily
Hi,
I just ran the test sample and the detection rate is 100%.
Starting time stamp: Mon, 25 Mar 2024 10:10:36
List of parameters:
Current working directory: ~/data/test/test_test
Input file name: scNanoGPS/example/fastq/example.fastq.gz
Output FastQ file name: test/processed.fastq.gz
Output barcode list name: test/barcode_list.tsv.gz
Log file name: test/scanner.log.txt
Parameters for pattern search:
Length of barcode: 16
Length of UMI: 12
5'-adaptor sequence: AAGCAGTGGTATCAACGCAGAGTACAT
3'-adaptor sequence: CTACACGACGCTCTTCCGATCT
PolyT sequence: TTTTTTTTTTTT
Scanning region length: 100
Penalty for dynamic programming:
Matching: 2
Mismatching: -3
Gap opening: -5
Gap extension: -2
Editing distance: 2
Parameters for computing:
Number of computer cores: 10
Number of reads per batch job: 1000
Minimal length of read: 200
Matching threshold: 0.7
Scoring threshold: 0.4
Debug mode switch: False
Total 7731 reads are processed.
Time elapse: 0 : 0 : 5.73
Detecting rate: 100.00%
Result counting:
Number of 3'-adaptor located on the read head region: 3854
Number of 3'-adaptor + polyT on the read head region: 3854
Number of 3'-adaptor located on the read tail region: 3877
Number of 3'-adaptor + polyT on the read tail region: 3877
Alignment counting:
Number of 3'-adaptor having no mismatch: 4677
Number of 3'-adaptor having mismatch at the last one position: 35
Number of 3'-adaptor having mismatch at all the last two position: 24
Number of 3'-adaptor having mismatch at all the last three position: 3
Number of 3'-adaptor having in/del at the last one position: 0
Number of 3'-adaptor having in/del at the last two position: 0
Number of 3'-adaptor having in/del at the last three position: 0
Number of rescued truncated 3'-adaptor on the read head region: 3
Number of rescued truncated 3'-adaptor on the read tail region: 879
Finish time stamp: Mon, 25 Mar 2024 10:10:42
Could you show me your library versions by running "pip3 install -r scNanoGPS/requirements.txt"? This command installs the required packages and shows their versions.
In addition, please try debug mode with the following command:
python3 scNanoGPS/scanner.py -i scNanoGPS/example/fastq/example.fastq.gz --debug_mode 1
and send me the first 50 to 100 lines of the output so I can inspect the scanner result and identify the problem. Thank you.
Regards, Cheng-Kai
Hi,
A thousand thanks for your response.
I now understand that the cause was the old Python version and the corresponding libraries installed in that environment. Here is the output from installing the required libraries in my Python 3.9 environment:
$ pip3 install -r requirements.txt Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting biopython (from -r requirements.txt (line 1)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/30/b0/73fc250af13256c1c1db1edd17f2786fb02dda4c141d809b0d4159c6bbf1/biopython-1.83-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 1.7 MB/s eta 0:00:00 Collecting distance (from -r requirements.txt (line 2)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/5c/1a/883e47df323437aefa0d0a92ccfb38895d9416bd0b56262c2e46a47767b8/Distance-0.1.3.tar.gz (180 kB) Preparing metadata (setup.py) ... done Collecting liqa (from -r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/3b/e8/b0c456108472fa256afeafe93084132fc229ffb5923151d8577eb7ad2dad/liqa-1.3.4-py3-none-any.whl (32 kB) Collecting matplotlib (from -r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/35/82/ca05c3e3ec4a38eaf49a9bfa1a700658284ddaaa2e2523fa91fbb96d207a/matplotlib-3.8.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 11.8 MB/s eta 0:00:00 Collecting pandas (from -r requirements.txt (line 5)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1a/5e/71bb0eef0dc543f7516d9ddeca9ee8dc98207043784e3f7e6c08b4a6b3d9/pandas-2.2.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.0/13.0 MB 12.3 MB/s eta 0:00:00 Collecting pysam (from -r requirements.txt (line 6)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/35/22/3d01778c13f1103401313f1232c1c0596d97aaee21c1d60564640f3049bd/pysam-0.22.0.tar.gz (4.6 MB) Installing build dependencies ... done Getting requirements to build wheel ... done Installing backend dependencies ... done Preparing metadata (pyproject.toml) ... 
done Collecting seaborn (from -r requirements.txt (line 7)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/83/11/00d3c3dfc25ad54e731d91449895a79e4bf2384dc3ac01809010ba88f6d5/seaborn-0.13.2-py3-none-any.whl (294 kB) Requirement already satisfied: numpy in /nfshome/store02/users/c.c23047690/.conda/envs/scNanoGPS/lib/python3.9/site-packages (from biopython->-r requirements.txt (line 1)) (1.26.4) Collecting lifelines (from liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b3/98/868d6b60a6a8847a53bca3b15b0e057fb3ed6395e5852f0c0c55bbaaa928/lifelines-0.28.0-py3-none-any.whl (349 kB) Collecting contourpy>=1.0.1 (from matplotlib->-r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a9/ba/d8fd1380876f1e9114157606302e3644c85f6d116aeba354c212ee13edc7/contourpy-1.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (310 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 311.0/311.0 kB 18.7 MB/s eta 0:00:00 Collecting cycler>=0.10 (from matplotlib->-r requirements.txt (line 4)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl (8.3 kB) Collecting fonttools>=4.22.0 (from matplotlib->-r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/99/61/720e74663d9b0d54f60230cce977f11650481ae3c703d938ac80c5536828/fonttools-4.50.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.6/4.6 MB 14.7 MB/s eta 0:00:00 Collecting kiwisolver>=1.3.1 (from matplotlib->-r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c0/a8/841594f11d0b88d8aeb26991bc4dac38baa909dc58d0c4262a4f7893bcbf/kiwisolver-1.4.5-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 28.5 MB/s eta 0:00:00 Collecting packaging>=20.0 (from matplotlib->-r 
requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/49/df/1fceb2f8900f8639e278b056416d49134fb8d84c5942ffaa01ad34782422/packaging-24.0-py3-none-any.whl (53 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.5/53.5 kB 1.4 MB/s eta 0:00:00 Collecting pillow>=8 (from matplotlib->-r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fd/98/35887712a640fe016817988141db021e1398b6d6620d29f8dceaffe72656/pillow-10.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 14.1 MB/s eta 0:00:00 Collecting pyparsing>=2.3.1 (from matplotlib->-r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9d/ea/6d76df31432a0e6fdf81681a895f009a4bb47b3c39036db3e1b528191d52/pyparsing-3.1.2-py3-none-any.whl (103 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.2/103.2 kB 7.1 MB/s eta 0:00:00 Collecting python-dateutil>=2.7 (from matplotlib->-r requirements.txt (line 4)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ec/57/56b9bcc3c9c6a792fcbaf139543cee77261f3651ca9da0c93f5c1221264b/python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB) Collecting importlib-resources>=3.2.0 (from matplotlib->-r requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/75/06/4df55e1b7b112d183f65db9503bff189e97179b256e1ea450a3c365241e0/importlib_resources-6.4.0-py3-none-any.whl (38 kB) Collecting pytz>=2020.1 (from pandas->-r requirements.txt (line 5)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9c/3d/a121f284241f08268b21359bd425f7d4825cffc5ac5cd0e1b3d82ffd2b10/pytz-2024.1-py2.py3-none-any.whl (505 kB) Collecting tzdata>=2022.7 (from pandas->-r requirements.txt (line 5)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/65/58/f9c9e6be752e9fcb8b6a0ee9fb87e6e7a1f6bcab2cdc73f02bb7ba91ada0/tzdata-2024.1-py2.py3-none-any.whl (345 kB) Collecting zipp>=3.1.0 (from importlib-resources>=3.2.0->matplotlib->-r 
requirements.txt (line 4)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c2/0a/ba9d0ee9536d3ef73a3448e931776e658b36f128d344e175bc32b092a8bf/zipp-3.18.1-py3-none-any.whl (8.2 kB) Collecting six>=1.5 (from python-dateutil>=2.7->matplotlib->-r requirements.txt (line 4)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB) Requirement already satisfied: scipy>=1.2.0 in /nfshome/store02/users/c.c23047690/.conda/envs/scNanoGPS/lib/python3.9/site-packages (from lifelines->liqa->-r requirements.txt (line 3)) (1.12.0) Collecting autograd>=1.5 (from lifelines->liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/81/70/d5c7c2a458b8be96495c8b1634c2155beab58cbe864b7a9a5c06c2e52520/autograd-1.6.2-py3-none-any.whl (49 kB) Collecting autograd-gamma>=0.3 (from lifelines->liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/85/ae/7f2031ea76140444b2453fa139041e5afd4a09fc5300cfefeb1103291f80/autograd-gamma-0.5.0.tar.gz (4.0 kB) Preparing metadata (setup.py) ... 
done Collecting formulaic>=0.2.2 (from lifelines->liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2c/09/7a9f95d35106d882f79ddabc2d33d8f2a262863f1f5d6fd00f46c5fc90aa/formulaic-1.0.1-py3-none-any.whl (94 kB) Collecting future>=0.15.2 (from autograd>=1.5->lifelines->liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/da/71/ae30dadffc90b9006d77af76b393cb9dfbfc9629f339fc1574a1c52e6806/future-1.0.0-py3-none-any.whl (491 kB) Collecting interface-meta>=1.2.0 (from formulaic>=0.2.2->lifelines->liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/02/3f/a6ec28c88e2d8e54d32598a1e0b5208a4baa72a8e7f6e241beab5731eb9d/interface_meta-1.3.0-py3-none-any.whl (14 kB) Collecting typing-extensions>=4.2.0 (from formulaic>=0.2.2->lifelines->liqa->-r requirements.txt (line 3)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f9/de/dc04a3ea60b22624b51c703a84bbe0184abcd1d0b9bc8074b5d6b7ab90bb/typing_extensions-4.10.0-py3-none-any.whl (33 kB) Collecting wrapt>=1.0 (from formulaic>=0.2.2->lifelines->liqa->-r requirements.txt (line 3)) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b1/e7/459a8a4f40f2fa65eb73cb3f339e6d152957932516d18d0e996c7ae2d7ae/wrapt-1.16.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (80 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 80.1/80.1 kB 1.2 MB/s eta 0:00:00 Building wheels for collected packages: distance, pysam, autograd-gamma Building wheel for distance (setup.py) ... done Created wheel for distance: filename=Distance-0.1.3-py3-none-any.whl size=16258 sha256=b5a69da8b6df4d60cfcb35b39ae9b45b9cc214b22ae9a80fb0ef034a6d89189b Stored in directory: /nfshome/store02/users/c.c23047690/.cache/pip/wheels/9d/b6/0e/d6ebc83ecc5ad23c74204af61f77817d4c0d3e792afd09fc3f Building wheel for pysam (pyproject.toml) ... 
done Created wheel for pysam: filename=pysam-0.22.0-cp39-cp39-linux_x86_64.whl size=8439949 sha256=ea6fc6646019408a683cf7d6d6e709f2036b7d00b71a69e91a182e01fc0f1914 Stored in directory: /nfshome/store02/users/c.c23047690/.cache/pip/wheels/69/8e/be/1c65c15f101a6931b3619f7ec781d1f630e85529c2ed26eb01 Building wheel for autograd-gamma (setup.py) ... done Created wheel for autograd-gamma: filename=autograd_gamma-0.5.0-py3-none-any.whl size=4031 sha256=11735b468603551ce876e437f8542618cf5a5cddda4dbd6328c7849d388d03b9 Stored in directory: /nfshome/store02/users/c.c23047690/.cache/pip/wheels/cb/60/73/b25b695bbaed121a41fd3550400f073e5020ffa4c9e7ce6b4e Successfully built distance pysam autograd-gamma Installing collected packages: pytz, distance, zipp, wrapt, tzdata, typing-extensions, six, pysam, pyparsing, pillow, packaging, kiwisolver, interface-meta, future, fonttools, cycler, contourpy, biopython, python-dateutil, importlib-resources, autograd, pandas, matplotlib, autograd-gamma, seaborn, formulaic, lifelines, liqa Successfully installed autograd-1.6.2 autograd-gamma-0.5.0 biopython-1.83 contourpy-1.2.0 cycler-0.12.1 distance-0.1.3 fonttools-4.50.0 formulaic-1.0.1 future-1.0.0 importlib-resources-6.4.0 interface-meta-1.3.0 kiwisolver-1.4.5 lifelines-0.28.0 liqa-1.3.4 matplotlib-3.8.3 packaging-24.0 pandas-2.2.1 pillow-10.2.0 pyparsing-3.1.2 pysam-0.22.0 python-dateutil-2.9.0.post0 pytz-2024.1 seaborn-0.13.2 six-1.16.0 typing-extensions-4.10.0 tzdata-2024.1 wrapt-1.16.0 zipp-3.18.1
The versions of these libraries are newer than those listed on the GitHub page, and they should support the analysis.
Thanks.
Regards, Lily
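A quick way to compare two environments without reading through the full pip log is to print only the installed versions. The snippet below is a minimal sketch using the standard-library `importlib.metadata`; the package list is taken from the requirement names that appear in the pip output above, and the helper name `installed_versions` is just an illustration, not part of scNanoGPS.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Package names as they appear in the pip output above (requirements.txt lines 1-7).
REQUIRED = ["biopython", "distance", "liqa", "matplotlib", "pandas", "pysam", "seaborn"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print the interpreter version first, since the thread points at Python itself.
    print("python", sys.version.split()[0])
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running this inside each conda environment and pasting the few resulting lines makes version differences much easier to spot.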
Hi,
The libraries you just installed are all the same versions as the ones I have installed. Could you please also send me the first 50 to 100 lines of the scanner debug_mode output?
python3 scNanoGPS/scanner.py -i scNanoGPS/example/fastq/example.fastq.gz --debug_mode 1
Thanks.
Regards, Cheng-Kai
Hi,
Thank you very much. The 'scanner.log.txt' output under debug_mode is as follows:
Starting time stamp: Mon, 25 Mar 2024 15:55:33
List of parameters:
Current working directory: scNanoGPS
Input file name: scNanoGPS/example/fastq/example.fastq.gz
Output FastQ file name: scNanoGPS_res/processed.fastq.gz
Output barcode list name: scNanoGPS_res/barcode_list.tsv.gz
Log file name: scNanoGPS_res/scanner.log.txt
Parameters for pattern search:
Length of barcode: 16
Length of UMI: 12
5'-adaptor sequence: AAGCAGTGGTATCAACGCAGAGTACAT
3'-adaptor sequence: CTACACGACGCTCTTCCGATCT
PolyT sequence: TTTTTTTTTTTT
Scanning region length: 100
Penalty for dynamic programming:
Matching: 2
Mismatching: -3
Gap opening: -5
Gap extension: -2
Editing distance: 2
Parameters for computing:
Number of computer cores: 2
Number of reads per batch job: 1000
Minimal length of read: 200
Matching threshold: 0.7
Scoring threshold: 0.4
Debug mode switch: 1
Total 7731 reads are processed.
Time elapse: 0 : 0 : 19.07
Detecting rate: 100.00%
Result counting:
Number of 3'-adaptor located on the read head region: 3854
Number of 3'-adaptor + polyT on the read head region: 3854
Number of 3'-adaptor located on the read tail region: 3877
Number of 3'-adaptor + polyT on the read tail region: 3877
Alignment counting:
Number of 3'-adaptor having no mismatch: 4677
Number of 3'-adaptor having mismatch at the last one position: 35
Number of 3'-adaptor having mismatch at all the last two position: 24
Number of 3'-adaptor having mismatch at all the last three position: 3
Number of 3'-adaptor having in/del at the last one position: 0
Number of 3'-adaptor having in/del at the last two position: 0
Number of 3'-adaptor having in/del at the last three position: 0
Number of rescued truncated 3'-adaptor on the read head region: 3
Number of rescued truncated 3'-adaptor on the read tail region: 879
Finish time stamp: Mon, 25 Mar 2024 15:55:53
The adaptor detection rate is now 100%.
Regards, Lily
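For readers wondering what the "Penalty for dynamic programming" values in the log above amount to, here is a small sketch that scores an already-aligned pair of sequences with those numbers (match 2, mismatch -3, gap opening -5, gap extension -2). It is only an illustration: the function name `score_alignment` is hypothetical, and the gap convention (the first gap position costs the opening penalty, each further position costs the extension penalty) is an assumption, not something confirmed from scanner.py. How scNanoGPS applies its matching and scoring thresholds to such scores is not shown here.

```python
def score_alignment(row_a, row_b, match=2, mismatch=-3, gap_open=-5, gap_extend=-2):
    """Score two gapped alignment rows of equal length ('-' marks a gap).

    Assumed convention: the first position of a gap run costs gap_open,
    every following position in the same run costs gap_extend.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    score = 0
    prev_gap_row = None  # which row carried a gap in the previous column
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if x == "-" or y == "-":
            gap_row = "a" if x == "-" else "b"
            score += gap_extend if gap_row == prev_gap_row else gap_open
            prev_gap_row = gap_row
        else:
            score += match if x == y else mismatch
            prev_gap_row = None
    return score


# 3'-adaptor sequence copied from the log above (22 nt).
ADAPTOR = "CTACACGACGCTCTTCCGATCT"

if __name__ == "__main__":
    print(score_alignment(ADAPTOR, ADAPTOR))               # perfect match
    print(score_alignment(ADAPTOR, ADAPTOR[:-1] + "A"))    # mismatch at the last position
    print(score_alignment("ACGTACGT", "AC--ACGT"))         # one gap of length 2
```

Under these assumptions a perfect hit to the 22-nt adaptor scores 22 × 2 = 44, and a single mismatch lowers that to 21 × 2 - 3 = 39.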
Hi,
It looks like it's working now. If you still have the data available, please try running SRR21492154 again and check whether you get a detection rate of 70% or higher. Once you get above 70% with SRR21492154, you're good to run our pipeline on your own sample. Thanks.
Regards, Cheng-Kai
Hi, thanks for developing this tool!
Could you explain a bit more what the log file is saying? Here is an example of my logfile.
Specifically, what does the detection rate mean? Does it mean that only 13.28% of the reads have valid adaptors, and that the reads without them are discarded?
Thanks!
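One observation from the two complete logs posted earlier in this thread: in both of them, the reported "Detecting rate" equals the number of reads with a 3'-adaptor found in the head region plus the number found in the tail region, divided by the total number of processed reads. That is, (2831310 + 9843346) / 98363542 ≈ 12.89% for the SRR21492154 run and (3854 + 3877) / 7731 = 100.00% for the example data. This is inferred from the numbers only and has not been checked against the scanner.py source, so treat the sketch below as a consistency check; the function name `detection_rate` is hypothetical. It also says nothing about what happens to reads without a detected adaptor.

```python
import re


def detection_rate(log_text):
    """Recompute the rate as (head + tail 3'-adaptor hits) / total reads, in percent.

    Assumption: this formula is inferred from the logs in this thread,
    not taken from the scNanoGPS source code.
    """
    def grab(pattern):
        found = re.search(pattern, log_text)
        if found is None:
            raise ValueError(f"pattern not found in log: {pattern}")
        return int(found.group(1))

    total = grab(r"Total (\d+) reads are processed")
    head = grab(r"3'-adaptor located on the read head region: (\d+)")
    tail = grab(r"3'-adaptor located on the read tail region: (\d+)")
    return 100.0 * (head + tail) / total


# Counts copied from the two logs in this thread.
EXAMPLE_LOG = (
    "Total 7731 reads are processed. "
    "Number of 3'-adaptor located on the read head region: 3854 "
    "Number of 3'-adaptor located on the read tail region: 3877"
)
SRR_LOG = (
    "Total 98363542 reads are processed. "
    "Number of 3'-adaptor located on the read head region: 2831310 "
    "Number of 3'-adaptor located on the read tail region: 9843346"
)

if __name__ == "__main__":
    print(f"{detection_rate(EXAMPLE_LOG):.2f}%")
    print(f"{detection_rate(SRR_LOG):.2f}%")
```

The same function can be pointed at the full text of a scanner.log.txt file to check whether a 13.28% figure follows the same pattern.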