gaolabtools / scNanoGPS

Single cell Nanopore sequencing data for Genotype and Phenotype

step4 curator: ValueError: file has no sequences defined (mode='rb') #21

Closed LilyLuyang closed 2 months ago

LilyLuyang commented 4 months ago

Hi author,

Thanks for this very useful tool!

I had this problem several times and also tried to solve it, but failed. Here is the issue:

Traceback (most recent call last):
  File "./scNanoGPS/curator.py", line 175, in <module>
    seq_dict = curator_io.build_read_seq_dict(os.path.join(options.tmp_dir, fq_pref) + ".minimap2.bam")
  File "./scNanoGPS/curator_core/curator_io.py", line 193, in build_read_seq_dict
    bam_f = pysam.AlignmentFile(bam_name, "rb")
  File "pysam/calignmentfile.pyx", line 340, in pysam.calignmentfile.AlignmentFile.__cinit__
  File "pysam/calignmentfile.pyx", line 589, in pysam.calignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_seq=True

The file might not be in SAM/BAM format after the fastq.gz files were generated for minimap2 mapping. I added 'check_seq=True', changing 'bam_f = pysam.AlignmentFile(bam_name, "rb")' to 'bam_f = pysam.AlignmentFile(bam_name, "rb", check_seq=True)', but pysam rejects it as an unexpected keyword argument.

Therefore, could you please give me some suggestions on how to fix the minimap2 alignment step? I would appreciate it very much.

Thanks so much.

Best, Lily
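One way to see what pysam is objecting to: as far as I understand, this `ValueError` is raised when the BAM header declares zero reference sequences. The sketch below is not part of scNanoGPS. It reads the header with only the standard library (BAM is BGZF-compressed, which `gzip` can decompress) and returns that count. The function name `count_bam_references` is made up for illustration.

```python
import gzip
import struct


def count_bam_references(bam_path):
    """Return the number of reference sequences declared in a BAM header."""
    # BGZF is a series of gzip members, so gzip.open can decompress it.
    with gzip.open(bam_path, "rb") as handle:
        magic = handle.read(4)
        if magic != b"BAM\x01":
            raise ValueError(f"{bam_path} does not start with the BAM magic bytes")
        # l_text: length of the plain-text SAM header, little-endian int32.
        (text_length,) = struct.unpack("<i", handle.read(4))
        handle.read(text_length)  # skip the header text itself
        # n_ref: number of reference sequences, little-endian int32.
        (n_ref,) = struct.unpack("<i", handle.read(4))
    return n_ref
```

A result of 0 for a `<CB>.minimap2.bam` in the tmp directory would suggest that the alignment step wrote a header without any reference sequences, so the problem would lie upstream of pysam rather than in the keyword argument.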

shiauck commented 4 months ago

Hi,

The error message could be due to an empty single-cell FastQ file, which is then opened by the pysam module. I think something went wrong in the scanner step. Could you check your scanner log file? The detection rate should be higher than 70%; otherwise something is wrong. Please provide more details. Thank you.

Regards, Cheng-kai
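A quick way to test the "empty single-cell FastQ" hypothesis is to count the reads in each per-barcode file. This is a minimal sketch, assuming the per-barcode files sit in the temporary directory as `<barcode>.fastq.gz` (the naming seen elsewhere in this thread) and use the usual four-lines-per-record FASTQ layout. The helper names are hypothetical.

```python
import gzip
from pathlib import Path


def count_fastq_reads(fastq_gz):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4


def find_empty_fastqs(tmp_dir):
    """Return the names of *.fastq.gz files in tmp_dir that contain no reads."""
    return sorted(
        path.name
        for path in Path(tmp_dir).glob("*.fastq.gz")
        if count_fastq_reads(path) == 0
    )
```

Calling `find_empty_fastqs("tmp")` would list the barcodes whose read files are empty. An empty list would suggest looking elsewhere, for example at the alignment step.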

LilyLuyang commented 4 months ago

Hi Cheng-kai,

Thanks for your response. This issue occurred when I used the 'processed.fastq.gz' for the following steps; the scanner step had shown a low detection rate of adaptors for it. The Python version is 3.7.12. Details are in the first issue.

Regards, Lily

shiauck commented 4 months ago

Hi Lily,

Please try to install python 3.9.

I have never tried Python 3.7 because it is out of date. Many tools and libraries are deprecated or no longer maintained on 3.7. The first scanning step uses biopython/pysam, which are quite version-sensitive. I think the processed fastq.gz you obtained might be truncated.

Regards, Cheng-Kai
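Whether a `fastq.gz` is truncated can be checked directly by decompressing it to the end. The sketch below does roughly what `gzip -t` does, using only the standard library. The function name `is_gzip_intact` is an assumption for illustration, and treating any `OSError` as "not intact" is a simplification (a missing file is reported the same way as a corrupt one).

```python
import gzip
import zlib


def is_gzip_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at path decompresses cleanly to its end."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data; a truncated stream raises EOFError.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True
```

Running this on `processed.fastq.gz` before the assigner and curator steps would separate a damaged input file from a genuine problem in the pipeline.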

myanofgintmd commented 4 months ago

Hello Cheng-Kai

Thank you very much for developing scNanoGPS. I have started using it for a long-read RNA-seq project. I am sorry for interrupting the conversation between you and Lily, but I have the same trouble, so I think it is good to join the conversation.

When I run Curator after Scanner and Assigner on a fastq file, it stops with the message below:

Traceback (most recent call last):
  File "/home/minoruyano/opt/local/scNanoGPS/curator.py", line 175, in <module>
    seq_dict = curator_io.build_read_seq_dict(os.path.join(options.tmp_dir, fq_pref) + ".minimap2.bam")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/minoruyano/opt/local/scNanoGPS/curator_core/curator_io.py", line 193, in build_read_seq_dict
    bam_f = pysam.AlignmentFile(bam_name, "rb")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pysam/libcalignmentfile.pyx", line 747, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 996, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False

When I check the files in the tmp directory, I find the following files, which relate to only one specific CB, among the fastq.gz files of all the true CBs.

[theCB].minimap2.bam
[theCB].minimap2.bam.bai
[theCB].minimap2.log.txt
[theCB].sam
[theCB].unsorted.bam

I may have had some trouble before running Curator, during Scanner. The message below is output for each read.

warnings.warn(
/home/minoruyano/anaconda3/lib/python3.11/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.

Additional information about the scNanoGPS run: I got a detection rate of around 85-90% in Scanner. I ran all of the scNanoGPS tools with Python 3.11.

I hope to fix this trouble and reach the final scNanoGPS result for my fastq reads!

Best regards, Minoru YANO Tokyo Medical and Dental University

shiauck commented 4 months ago

Hi Minoru,

Thank you for your feedback.

Could you provide your assigner log file? Furthermore, could you please delete the whole temporary folder ("tmp" by default) and then re-run the curator? The error message you saw is due to an incomplete single-cell fastq read file during the preparation step of the curator. This could be caused by a wrong assigner result or by some unknown system error during that preparation step.

About the warning message you saw:

warnings.warn(
/home/minoruyano/anaconda3/lib/python3.11/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.

This is normal because Biopython is going to replace the old "pairwise2" module with "Align.PairwiseAligner". I'll update our pipeline in the next version. It still works for now.

Hope this helps.

Regards, Cheng-Kai
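Since the warning is harmless but printed for every read, it can be filtered out by its message text until the pipeline is updated. This is only a sketch using Python's standard `warnings` module. The function name is hypothetical, and where such a call would belong inside scNanoGPS is not something this thread specifies. It would need to run before the alignment code is called.

```python
import warnings


def silence_pairwise2_warning():
    """Ignore the Bio.pairwise2 deprecation notice and leave other warnings alone."""
    # The message pattern is matched against the start of the warning text.
    warnings.filterwarnings(
        "ignore",
        message=r"Bio\.pairwise2 has been deprecated",
    )
```

Filtering by message rather than by warning category keeps the sketch independent of Biopython's own classes, at the cost of breaking silently if the wording of the notice changes.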

myanofgintmd commented 3 months ago

Hi Cheng-Kai, thanks for the reply.

I deleted the "tmp" directory and re-ran the scanner and the assigner. Below is the assigner log file.

Starting time stamp: Mon, 01 Apr 2024 10:06:00

        Analysis result directory:            scNanoGPS_res
        Cell barcode list file:               scNanoGPS_res/barcode_list.tsv.gz
        UMI counting file name:               scNanoGPS_res/CB_counting.tsv.gz
        Log10(UMI) distribution figure name:  scNanoGPS_res/CB_log10_dist.png
        Distance for merging cell barcodes:   2
        Merged cell barcodes distance matrix: scNanoGPS_res/CB_merged_dist.tsv.gz
        Merged cell barcodes file name:       scNanoGPS_res/CB_merged_list.tsv.gz
        Length of barcode:                    16
        Number of computer cores:             1

        Estimated cell number(plus extension):160

Finished time stamp: Mon, 01 Apr 2024 10:07:17

If you need any further information in order to investigate this issue, please let me know.

Best regards, Minoru

LilyLuyang commented 3 months ago

Hi Cheng-Kai,

Many thanks for your helpful response! I have resolved the Python environment issue, and the scanner step now runs smoothly with a 78.47% detection rate of cell barcodes and UMIs.

However, another issue occurs now. After the reads are split by barcode into individual fastq files, they can't be mapped: no bam files are generated. Here is the curator.log.txt:

/home/c.c23047690/scNanoGPS/curator.py -t 2 --fq_name /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.processed.fastq.gz --ref_genome /scratch/c.c23047690/reference/GRCh38.p13.genome.fa.gz --tmp_dir /scratch/c.c23047690/scNanoGPS/tmp --minimap2 /home/c.c23047690/.conda/envs/scNanoGPS/bin/minimap2 --samtools /home/c.c23047690/.conda/envs/scNanoGPS/bin/samtools --spoa /home/c.c23047690/.conda/envs/scNanoGPS/bin/spoa -d /scratch/c.c23047690/scNanoGPS/scNanoGPS_res -b /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.barcode_list.tsv.gz --CB_count /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.CB_counting.tsv.gz --CB_list /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.CB_merged_list.tsv.gz --log SRR21492154.curator.log.txt

Separation of reads by cell barcode spent 6 : 38 : 19.71

Summary table of curation per cell barcode:
Cellbarcode     Raw records     Low softclipping records        High softclipping records       Non-duplicated records  Duplicated records      Curated records Time spent

And the error log is: FileNotFoundError: [Errno 2] could not open alignment file /scratch/c.c23047690/scNanoGPS/tmp/TTCGGTACAATGAATG.minimap2.bam: No such file or directory

I am confused about why those fastq files can't be mapped to the reference genome to generate the minimap2.bam files for the following analysis. P.S.: the example file runs successfully in this curator step, and its bam files are generated fine. Could you please provide some advice about this issue? I am now running this step again. I would appreciate it very much!

Regards, Lily
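To find out how widespread the missing-BAM problem is, the temporary directory can be scanned for barcodes that have a read file but no alignment. This is a minimal sketch, assuming the `<barcode>.fastq.gz` and `<barcode>.minimap2.bam` naming that appears in this thread. The function name is made up for illustration.

```python
from pathlib import Path


def barcodes_missing_bam(tmp_dir):
    """List barcodes that have <barcode>.fastq.gz but no <barcode>.minimap2.bam."""
    tmp = Path(tmp_dir)
    missing = []
    for fastq in sorted(tmp.glob("*.fastq.gz")):
        # Strip the ".fastq.gz" suffix to recover the barcode sequence.
        barcode = fastq.name[: -len(".fastq.gz")]
        if not (tmp / f"{barcode}.minimap2.bam").exists():
            missing.append(barcode)
    return missing
```

If every barcode is reported as missing, the alignment itself is probably failing (reference genome, minimap2 path), and the per-barcode `.minimap2.log.txt` files mentioned earlier in the thread would be the next place to look. If only a few are missing, low-read barcodes are a more likely cause.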

shiauck commented 3 months ago

Hi Minoru,

How many cells are you targeting?

In the assigner log file, it detected 160 cells. Could you provide "CB_log10_dist.png" from the result folder?

Thanks.

Regards, Cheng-Kai

shiauck commented 3 months ago

Hi Lily,

How many cells are you targeting? Could you provide the assigner log file and CB_log10_dist.png from the result folder? Thanks.

Regards, Cheng-Kai

LilyLuyang commented 3 months ago

Hi, thanks for your quick response.

I didn't set '--forced_no 20000' here. Here is the assigner log file:

    Starting time stamp: Wed, 27 Mar 2024 19:20:16

    Analysis result directory:            /scratch/c.c23047690/scNanoGPS/scNanoGPS_res
    Cell barcode list file:               /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.barcode_list.tsv.gz
    UMI counting file name:               /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.CB_counting.tsv.gz
    Log10(UMI) distribution figure name:  /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.CB_log10_dist.png
    Distance for merging cell barcodes:   2
    Merged cell barcodes distance matrix: /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.CB_merged_dist.tsv.gz
    Merged cell barcodes file name:       /scratch/c.c23047690/scNanoGPS/scNanoGPS_res/SRR21492154.CB_merged_list.tsv.gz
    Length of barcode:                    16
    Number of computer cores:             2

LilyLuyang commented 3 months ago

[image: SRR21492154 CB_log10_dist]

This is the CB_log10_dist figure. 20000 fastq files were generated in the curator step, but no bam files were generated.

Thanks a lot.

Best, Lily

myanofgintmd commented 3 months ago

Hi Cheng-Kai,

This is a pilot experiment (a pre-test), using an oligo pool as adapter primers and preparing the cDNA sample from an mRNA library in a single tube as a bulk reaction. The sample must contain 100 CB sequences. As we expected, the assigner distinguishes the 100 CBs at the top, followed by other CBs with very low counts.

I clipped the border region of the CBs from the "CB_counting" file, shown in the image below.

[image: excerpt of CB_counting]

The image below is "CB_log10_dist.png".

[image: CB_log10_dist]

Is my CB setup unsuitable for the curator? I hope that the data above will lead us to solve the problem.

Best regards, Minoru

LilyLuyang commented 3 months ago

Hi Cheng-Kai,

I just want to know whether step 3 leads to the error in the following step 4, where no bam files are generated. In addition, it's strange that files such as 'SRR21492154.CB_counting.tsv.gz' and 'SRR21492154.CB_merged_list.tsv.gz' have contents similar to the example files, yet no bam files are generated in step 4.

I would appreciate it very much.

Best regards, Lily

myanofgintmd commented 3 months ago

Hi Cheng-Kai,

The rank plot of CB versus number of UMIs for the top 500 CBs is below. Assigner may have worked correctly, because the plot is reasonable for my sample with 100 CBs.

[image]

Best regards, Minoru

LilyLuyang commented 3 months ago

Hi Minoru,

Sorry for disturbing your discussion with Cheng-Kai.

May I ask whether the following curator step works for you and generates curated.bam files, given that this issue hasn't been solved yet?

Thanks a lot.

Best regards, Lily

myanofgintmd commented 3 months ago

Hi Lily, Cheng-Kai,

I have a nice update today about the trouble. Lily, no problem. I am happy to share all information with you as well as with Cheng-Kai.

-- Trouble was solved!! I solved it by using a proper reference genome file for the curator process. I downloaded the file again from Ensembl or NCBI, and then the curator process generated result files. The original reference genome files I had been using seem to have been broken or incorrect: their file sizes differed from the current ones.

I am sorry that the cause of the trouble was such a basic matter and that I have been bothering you about it. Next, I should check that the result files and the log information are fine. I will report here once I have confirmed that.

-- Reference genome sequence files that I use now.

from ensembl
936478864 byte    Homo_sapiens.GRCh38.dna.toplevel.fa.gz
https://ftp.ensembl.org/pub/current_fasta/homo_sapiens/dna/
from NCBI
972898531 byte    GRCh38_latest_genomic.fna.gz
https://www.ncbi.nlm.nih.gov/genome/guide/human/

-- Files generated by the curator process under the tmp directory

The process generates 6 files for each "sequence of the CB" [in my case, the number of CBs is 100]
[sequence of the CB].consensus.minimap2.log.txt  
[sequence of the CB].curated.minimap2.bam.bai  
[sequence of the CB].high_softclipping.bam
[sequence of the CB].curated.minimap2.bam        
[sequence of the CB].fastq.gz                  
[sequence of the CB].minimap2.log.txt

-- curator log file

Starting time stamp: Fri, 12 Apr 2024 17:36:47
List of parameters:
        Input fastQ file:           [xxxxxxx]/processed.fastq.gz
        Cell barcode list:          [xxxxxxx]/barcode_list.tsv.gz
        Cell barcode counting file: [xxxxxxx]/CB_counting.tsv.gz
        Merged cell barcode list:   [xxxxxxx]/CB_merged_list.tsv.gz
        Reference genome:           [xxxxxxx]/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
        Minimap2 genome index:      None
        Output directory:           [xxxxxxx]
        Temporary directory:        tmp
        Number of computer cores:   2
        Log file name:              [xxxxxxx]/curator.log.txt
        LD for merging UMI:         2
        Includion region BED:       None
        Excludion region BED:       None
        Threshold for softclipping: 0.8
        Path of minimap2:           minimap2
        Path of samtools:           samtools
        Path spoa:                  spoa
        Skip curation step:         None

[xxxxxxx]/curator.py -t 24 --ref_genome [xxxxxxx]/Homo_sapiens.GRCh38.dna.toplevel.fa.gz -d [xxxxxxx]

Separation of reads by cell barcode spent 0 : 6 : 34.64

Summary table of curation per cell barcode:
Cellbarcode     Raw records     Low softclipping records        High softclipping records       Non-duplicated records  Duplicated records      Curated records Time spent

[list of all the CBs' stats. In my case there are 100 rows]

Curation process time spent: 4 : 40 : 9.13

Please let me know if you need any further information about my situation.

Best regards, Minoru
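Because the root cause here was a damaged reference download, a small pre-flight check on the reference file may save a long curator run. The sketch below is an illustration only, not part of scNanoGPS. It reports the file size, optionally compares it with an expected size (for example a byte count like the ones quoted above, bearing in mind that "current" releases change over time), and counts FASTA header lines. The function name and the returned dictionary layout are assumptions.

```python
import gzip
import os


def summarize_reference(fasta_gz, expected_bytes=None):
    """Report the size and number of sequences of a gzipped FASTA file."""
    size = os.path.getsize(fasta_gz)
    n_sequences = 0
    # Streaming the whole file also exercises the gzip stream: a truncated
    # download raises EOFError before the loop can finish.
    with gzip.open(fasta_gz, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                n_sequences += 1
    return {
        "bytes": size,
        "size_matches": None if expected_bytes is None else size == expected_bytes,
        "sequences": n_sequences,
    }
```

A sequence count of zero, a size mismatch, or an `EOFError` while reading would each be a reason to download the reference again before running the curator.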

LilyLuyang commented 3 months ago

Hi Minoru,

Many thanks for your kind response!

I will switch to the reference genome that you recommended. I hope it will work. I'm very happy to communicate with you during the learning process.

Best regards, Luyang

shiauck commented 3 months ago

Hi Luyang,

Sorry for the late reply. I just got back from a week-long conference.

From your CB_log10_dist figure, you were probably targeting ten or twenty thousand cells when conducting the experiment. In the figure, you can see that the per-cell read number drops to the tens (the log10-scaled Y-axis is about 2) because of the limited yield of the Nanopore platform. Such a low read number (UMI number) per cell can result in an empty bam file and produce the error you see.

When you target ten or twenty thousand cells with insufficient reads per cell, the assigner can fail to automatically assign the optimal cell number because the slope drops too smoothly. Please try assigning a smaller cell number. In the end, you definitely don't need any cell that has only ten or twenty reads. With a sufficient read number, you won't see any empty bam files generated in step 4. Hope this helps.

Regards, Cheng-Kai
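The advice above amounts to keeping only barcodes with enough reads. A rough way to pick a smaller cell number is to count how many barcodes reach a chosen minimum. The sketch below assumes the per-barcode counts have already been loaded into a dictionary (the layout of `CB_counting.tsv.gz` is not described in this thread, so no parser is shown). The function name, the example barcodes and counts, and the threshold of 100 are all hypothetical.

```python
def suggest_cell_number(counts_by_barcode, min_reads=100):
    """Return how many barcodes have at least min_reads reads (or UMIs)."""
    return sum(1 for count in counts_by_barcode.values() if count >= min_reads)


# Hypothetical per-barcode counts, for illustration only.
example_counts = {
    "AAACCCAAGAAACACT": 5200,
    "AAACCCAAGAAACCAT": 3100,
    "AAACCCAAGAAACCCA": 140,
    "AAACCCAAGAAACCCG": 18,
    "AAACCCAAGAAACCTG": 11,
}
print(suggest_cell_number(example_counts, min_reads=100))  # prints 3
```

The resulting number could then serve as the smaller cell count to assign. The thread mentions a `--forced_no` option, which is presumably how such a number is passed to the assigner, but that should be confirmed against the scNanoGPS documentation.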

shiauck commented 3 months ago

Hi Minoru,

Sorry for the late reply. I just got back from a week-long conference. Thank you for your update. I'm glad to help answer any questions.

Regards, Cheng-Kai

Hi Lily, Cheng-Kai,

I have a nice update today about the trouble. Lify, no problem. I am happy to share all information with you as well as Cheng-Kai,

-- Trouble was solved!! I solved the trouble by using proper reference genome file for curator process. I reloaded the file again from ensemble or NCBI and then the curator process generated result files. Original reference genome files, which I had been using, seemed to be broken or incorrect. File size of the original ones were different from the current ones.

I am sorry that the reason of the trouble is very basic matter and that I have been bothring you about it. Next, I should check the result files and the log information are fine. I will report here once I find that they are fine.

-- reference genome sequence files which I use now.

from ensembl
936478864 byte    Homo_sapiens.GRCh38.dna.toplevel.fa.gz
https://ftp.ensembl.org/pub/current_fasta/homo_sapiens/dna/
from NCBI
972898531 byte    GRCh38_latest_genomic.fna.gz
https://www.ncbi.nlm.nih.gov/genome/guide/human/

-- files, generated by the curator process under tmp directory

The process generate 6 files for each of the "sequence of the CB" [In my case, the number of the CB is 100]
[sequence of the CB].consensus.minimap2.log.txt  
[sequence of the CB].curated.minimap2.bam.bai  
[sequence of the CB].high_softclipping.bam
[sequence of the CB].curated.minimap2.bam        
[sequence of the CB].fastq.gz                  
[sequence of the CB].minimap2.log.txt

-- curator log file

Starting time stamp: Fri, 12 Apr 2024 17:36:47
List of parameters:
        Input fastQ file:           [xxxxxxx]/processed.fastq.gz
        Cell barcode list:          [xxxxxxx]/barcode_list.tsv.gz
        Cell barcode counting file: [xxxxxxx]/CB_counting.tsv.gz
        Merged cell barcode list:   [xxxxxxx]/CB_merged_list.tsv.gz
        Reference genome:           [xxxxxxx]/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
        Minimap2 genome index:      None
        Output directory:           [xxxxxxx]
        Temporary directory:        tmp
        Number of computer cores:   2
        Log file name:              [xxxxxxx]/curator.log.txt
        LD for merging UMI:         2
        Includion region BED:       None
        Excludion region BED:       None
        Threshold for softclipping: 0.8
        Path of minimap2:           minimap2
        Path of samtools:           samtools
        Path spoa:                  spoa
        Skip curation step:         None

[xxxxxxx]/curator.py -t 24 --ref_genome [xxxxxxx]/Homo_sapiens.GRCh38.dna.toplevel.fa.gz -d [xxxxxxx]

Separation of reads by cell barcode spent 0 : 6 : 34.64

Summary table of curation per cell barcode:
Cellbarcode     Raw records     Low softclipping records        High softclipping records       Non-duplicated records  Duplicated records      Curated records Time spent

[list of the stats for all the CBs. In my case there are 100 rows]

Curation process time spent: 4 : 40 : 9.13

Please let me know if you need any further information about my situation.

Best regards, Minoru

LilyLuyang commented 3 months ago

Hi Minoru,

Many thanks for your kind response!

I will switch to the reference genome that you recommended. I hope it will work well. I'm very happy to communicate with you during the learning process.

Best regards, Luyang

Hi Minoru,

The updated reference genome works smoothly now, and the 6 files are generated in the curator step when I set the cell number to 1,000 in the assigner step. The isoform file generation (step5 reporter) is running now.

Many thanks for your debug sharing!

Best regards, Lily

LilyLuyang commented 3 months ago

Hi Luyang,

Sorry for the late reply. I just got back from a week-long conference.

From your CB_log10_dist figure, you were probably targeting ten or twenty thousand cells in the experiment. In the figure, you can see the per-cell read number dropping down to tens (log10 scale on Y-axis is about 2) because of the limited yield of the Nanopore platform. Such a low read number (UMI number) per cell can result in an empty bam file and produce the error you see.

When you target ten or twenty thousand cells with insufficient reads per cell, the assigner can fail to pick the optimal cell number automatically because the slope drops too smoothly. Please try assigning a smaller cell number. In the end, you definitely don't need any cell that has only ten or twenty reads. With a sufficient read number, you won't see any empty bam files generated in step 4. Hope this helps.
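
To see how many reads each cell barcode actually received, the per-cell FASTQ files in the curator tmp directory can be counted with a short script like the one below. This is a hedged sketch, not part of scNanoGPS: it assumes the files are named `<barcode>.fastq.gz` with standard 4-line records, and the `min_reads=100` cutoff is an arbitrary illustrative value, not a recommendation from the authors.

```python
import glob
import gzip
import os


def count_fastq_records(path):
    """Count records in a gzip-compressed FASTQ file (4 lines per record)."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def low_read_barcodes(tmp_dir, min_reads=100):
    """Return {barcode: read_count} for per-cell FASTQ files below min_reads.

    min_reads is a hypothetical cutoff chosen for illustration only.
    """
    suffix = ".fastq.gz"
    flagged = {}
    for path in sorted(glob.glob(os.path.join(tmp_dir, "*" + suffix))):
        barcode = os.path.basename(path)[: -len(suffix)]
        n_reads = count_fastq_records(path)
        if n_reads < min_reads:
            flagged[barcode] = n_reads
    return flagged
```

If most barcodes are flagged, that matches the situation described above, and assigning a smaller cell number in the assigner step is the suggested remedy.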

Regards, Cheng-Kai

[Figure: SRR21492154 CB_log10_dist] This is the CB_log10_dist figure. There are 20000 fastq files generated in curator step, but no bam files are generated. Thanks a lot. Best, Lily

Hi Cheng-Kai, I just want to know whether step3 leads to the error in the following step4, where no bam files are generated. In addition, it's strange that files such as 'SRR21492154.CB_counting.tsv.gz' and 'SRR21492154.CB_merged_list.tsv.gz' have contents similar to the example files, but no bam files are generated in step4. I would appreciate your help very much. Best regards, Lily

Hi Cheng-Kai,

Many thanks for your insightful advice!

Bam files are generated in 'curator' when the cell number is set to 1,000. The reason is that when the reads are split across 20,000 cells, there are too few reads per cell to align.

Expression patterns of gene and isoform are running now.

Best regards, Lily

LilyLuyang commented 3 months ago

Hi Minoru,

Thanks for your advice! Now the curator step runs well (using GRCh38_latest_genomic.fna.gz from NCBI), but the isoform expression matrix step of the reporter (using Homo_sapiens.GRCh38.100.gtf from Ensembl) fails with this error:

Traceback (most recent call last):
  File "~/.conda/envs/scNanoGPS/lib/python3.9/site-packages/liqa_src/quantify.py", line 153, in
    for read in bamFilePysam.fetch(geneChr, geneStart, geneEnd):
  File "pysam/libcalignmentfile.pyx", line 1089, in pysam.libcalignmentfile.AlignmentFile.fetch
  File "pysam/libchtslib.pyx", line 683, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid contig 19

There is no output in the 'matrix_isoform.tsv' file. I used this command to generate the liqa index: 'liqa -task refgene -ref ./reference/Homo_sapiens.GRCh38.100.gtf -format gtf -out ./reference/Homo_sapiens.GRCh38.100.gtf.ensembl.liqa.refgene'

I wonder whether the inconsistency between the fasta and gtf files used in step4 and step5 leads to the error. Have you ever seen this problem before? I'm happy to communicate with you.

Thank you very much!

Best regards, Lily

myanofgintmd commented 2 months ago

Hi Lily,

I am happy to communicate with you about this tool, too!

Could you try again with the files below?

Homo_sapiens.GRCh38.dna.toplevel.fa.gz
from
https://ftp.ensembl.org/pub/current_fasta/homo_sapiens/dna/

Homo_sapiens.GRCh38.111.gtf.gz
from
https://ftp.ensembl.org/pub/current_gtf/homo_sapiens/

I succeeded in running both my data and the official example data with these reference files. Please let me know whether these reference files work in your project.

Best regards, Minoru

myanofgintmd commented 2 months ago

Hi Lily,

If the reference files work well and you get to reporter_summary.py, please re-compress the reference genome file with gzip -d and bgzip. The bgzip format seems to be critical for reporter_summary.py, and the original Homo_sapiens.GRCh38.dna.toplevel.fa.gz file does not seem to be in bgzip format.

$ ls -l Homo_sapiens.GRCh38.dna.toplevel.fa.gz
-rw-r--r-- 1 minoruyano fgin  936478864 Apr 12 15:47 Homo_sapiens.GRCh38.dna.toplevel.fa.gz

$ gzip -d cp_Homo_sapiens.GRCh38.dna.toplevel.fa.gz
$ bgzip cp_Homo_sapiens.GRCh38.dna.toplevel.fa

$ ls -l Homo_sapiens.GRCh38.dna.toplevel.fa.gz*
-rw-r--r-- 1 minoruyano fgin 950902225 Apr 26 14:35 Homo_sapiens.GRCh38.dna.toplevel.fa.gz
-rw-r--r-- 1 minoruyano fgin     26844 Apr 26 14:38 Homo_sapiens.GRCh38.dna.toplevel.fa.gz.fai
-rw-r--r-- 1 minoruyano fgin    817016 Apr 26 14:38 Homo_sapiens.GRCh38.dna.toplevel.fa.gz.gzi
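
Whether a `.gz` file is already in bgzip (BGZF) format can be checked before re-compressing. The sketch below is an illustration, not something scNanoGPS provides: it relies on the BGZF layout described in the SAM/BAM specification, where each block is a gzip member with the FEXTRA flag set and an extra subfield identified by the bytes 'B' and 'C' with length 2. The function name `is_bgzf` is hypothetical, and only the first block is inspected.

```python
import struct


def is_bgzf(path):
    """Return True if the file starts with a BGZF (bgzip) block header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes (1f 8b), deflate method (8), FEXTRA flag (bit 2)
        if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
            return False
        if not header[3] & 0x04:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)
    # Walk the extra subfields looking for the 'BC' subfield of length 2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False
```

A file written by plain gzip returns False here, which is the case where the gzip -d and bgzip steps shown above are needed.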
LilyLuyang commented 2 months ago

Hi Minoru,

Many thanks for your response! Now I know why the error occurred. The NCBI reference genome fasta file uses accession numbers rather than chromosome names, while the fasta files from Ensembl and GENCODE use chromosome names, and Liqa quantification needs the chromosome names in the bam files to match those in the gtf file.
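
This kind of naming mismatch can be spotted before running the pipeline by comparing the sequence names in the fasta with the contig names in the gtf. The following is a minimal sketch under simple assumptions: sequence names are the first word of each fasta header line, contig names are the first tab-separated column of the gtf, and files ending in `.gz` are gzip-compressed. All function names are hypothetical.

```python
import gzip


def _open_text(path):
    """Open plain or gzip-compressed text based on the file extension."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def fasta_contigs(path):
    """Sequence names: the first word after '>' on each FASTA header line."""
    with _open_text(path) as handle:
        return {
            line[1:].split()[0]
            for line in handle
            if line.startswith(">") and line[1:].strip()
        }


def gtf_contigs(path):
    """Contig names from column 1 of a GTF file (comment lines skipped)."""
    with _open_text(path) as handle:
        return {
            line.split("\t", 1)[0]
            for line in handle
            if line.strip() and not line.startswith("#")
        }


def contigs_missing_from_fasta(fasta_path, gtf_path):
    """GTF contig names that have no matching sequence name in the FASTA."""
    return sorted(gtf_contigs(gtf_path) - fasta_contigs(fasta_path))
```

A non-empty result, such as a gtf contig named `19` with no sequence of that name in the fasta, points to the same situation as the "invalid contig 19" error.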

I have now changed the reference genome fasta file from NCBI to Gencode and am running the steps again. Hope it will work well.

fasta file from gencode: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_45/GRCh38.p14.genome.fa.gz

gtf file from gencode: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_45/gencode.v45.annotation.gtf.gz

Thanks for your very helpful suggestions and glad to discuss with you again!

Best regards, Lily

myanofgintmd commented 2 months ago

Hi Lily,

I tested the gencode reference data, which you mentioned in your last post, with the official example data. All the processes completed successfully, and the summary.txt after reporter_summary.py was as below:

Read yield:                  7731
Valid read number:           7731
Detecting rate:              100.0%

Median read length:          821.0
Mean read length:            984.92
Maximal read length:         5546
Median cell barcode quality: 23.31
Mean cell barcode quality:   22.74

Cell number:                 56
Raw reads per cell:          138.05
UMI counts:                  6920
Median UMI counts per cell:  122.5
Mean UMI counts per cell:    123.57
Median gene number:          65.0
Mean gene number:            60.55

Exonic:                      13.86%
Intronic:                    72.93%
Intergenic:                  13.21%

I just want to add that the reference genome file was not in bgzip format, so you may need to decompress "GRCh38.p14.genome.fa.gz" with gzip -d (add -k to keep the original) and re-compress it with bgzip.

If you cannot complete the process with these reference data, I recommend testing with the official example data. Please let me know if you have questions about the information above. I am happy to give you advice based on my experience.

Best regards, Minoru YANO

LilyLuyang commented 2 months ago

Hi Minoru,

A thousand thanks for your really helpful messages!

The reference files from Ensembl and Gencode work well now, and the 'isoform_tsv.gz' file is generated, so I am moving on to the summary step. I totally agree with you that we should decompress and re-compress the reference genome file "GRCh38.p14.genome.fa.gz" using gzip -d and bgzip, as it is not in bgzip format.

A quick question: how could the "--qualimap" parameter be used for this summary step since it hints that no related file could be found for qualimap?

Thanks so much for your assistance on addressing above questions.

Best regards, Lily

myanofgintmd commented 2 months ago

Hi Lily,

I am happy to hear that you have reached the summary step. Regarding the "--qualimap" parameter, you do not need to set it if the application "qualimap" can be run without specifying its path.

You can check your environment with the following command:

which qualimap

If it finds a path to qualimap, you might not need to set this parameter. If it does not, please set the parameter when running reporter_summary.py, for example, in my case:

--qualimap ~/opt/local/qualimap_v2.3/qualimap
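
The same check can be scripted. The sketch below uses Python's `shutil.which` to look for `qualimap` on the PATH and only adds the `--qualimap` option when it is not found. The function name `qualimap_args` and the `fallback_path` argument are hypothetical, and how the resulting list is passed on to reporter_summary.py is left to the reader.

```python
import shutil


def qualimap_args(fallback_path, lookup=shutil.which):
    """Return the extra arguments needed to locate qualimap.

    If 'qualimap' is found on PATH, no extra argument is needed;
    otherwise the explicit location is passed with --qualimap.
    The lookup parameter exists only to make the function easy to test.
    """
    if lookup("qualimap"):
        return []
    return ["--qualimap", fallback_path]
```

For example, `qualimap_args("~/opt/local/qualimap_v2.3/qualimap")` returns an empty list on a machine where `which qualimap` prints a path.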

Please let me know if you run into any error or trouble with this.

Best regards, Minoru YANO

LilyLuyang commented 2 months ago

Hi Minoru,

Many thanks for your really helpful advice!

I have already finished running the final summary step. By the way, I also encountered the "out of memory" Java error when running with --qualimap, but I found that you had solved this problem using --qualimap_param "--java-mem-size=300G", as provided by Cheng-Kai. It's a key point for generating the final 'summary.txt' file.

The summary table is here:

Read yield:                  98363542
Valid read number:           77189395
Detecting rate:              78.47%

Median read length:          801.0
Mean read length:            937.13
Maximal read length:         637942
Median cell barcode quality: 21.69
Mean cell barcode quality:   20.93

Cell number:                 944
Raw reads per cell:          104198.67
UMI counts:                  14995267
Median UMI counts per cell:  10451.0
Mean UMI counts per cell:    15884.82
Median gene number:          2729.0
Mean gene number:            3129.75

Exonic:                      17.63%
Intronic:                    64.16%
Intergenic:                  18.21%
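
The derived figures in this summary can be cross-checked from the raw counts. The sketch below is an illustration based on an inference from the numbers in this thread, not on the scNanoGPS source: the reported values are consistent with detecting rate = valid reads / read yield, raw reads per cell = read yield / cell number, and mean UMI counts per cell = UMI counts / cell number. The 70% cutoff comes from Cheng-Kai's earlier remark that the detection rate should be higher than 70%; the function name is hypothetical.

```python
def derived_metrics(read_yield, valid_reads, cell_number, umi_counts):
    """Recompute the derived summary figures from the raw counts.

    The formulas are inferred from the reported numbers, not taken
    from the scNanoGPS code.
    """
    detecting_rate = 100 * valid_reads / read_yield
    return {
        "detecting_rate_pct": round(detecting_rate, 2),
        "raw_reads_per_cell": round(read_yield / cell_number, 2),
        "mean_umi_per_cell": round(umi_counts / cell_number, 2),
        # Threshold mentioned earlier in the thread by the maintainer.
        "detecting_rate_ok": detecting_rate > 70,
    }


metrics = derived_metrics(
    read_yield=98363542,
    valid_reads=77189395,
    cell_number=944,
    umi_counts=14995267,
)
print(metrics)
```

With the counts above, the recomputed values agree with the reported 78.47%, 104198.67 and 15884.82.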

I appreciate that your experience has helped me a lot to run this pipeline in the cluster. Thank you and Cheng-Kai very much! It's a wonderful experience for me to learn how to use this good tool.

Best regards, Lily

shiauck commented 2 months ago

I'm closing this issue. Please feel free to reopen it if you have any further questions. Thanks.