CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

skip adding comment lines #553

Closed hukai916 closed 2 years ago

hukai916 commented 2 years ago

Hi developers,

Is there a way to skip adding comment lines at the top of the umi_tools extract command output? By default, the following comment lines are added at the top, which breaks some of my downstream software:

# UMI-tools version: 1.1.2
# output generated by extract --bc-pattern=NNNNNNNN -I trim5_bc1001--bc1028.fastq.gz
# job started at Mon Aug 15 17:49:07 2022 on LRB6608ML63221 -- 49154031-8fea-4049-96e6-e8a2431b3c15
# pid: 41359, system: Darwin 21.6.0 Darwin Kernel Version 21.6.0: Sat Jun 18 17:07:25 PDT 2022; root:xnu-8020.140.41~1/RELEASE_X86_64 x86_64
# blacklist                               : None
# compresslevel                           : 6
# correct_umi_threshold                   : 0
# either_read                             : False
# either_read_resolve                     : discard
# error_correct_cell                      : False
# extract_method                          : string
# filter_cell_barcode                     : None
# filter_cell_barcodes                    : False
# filter_umi                              : None
# filtered_out                            : None
# filtered_out2                           : None
# ignore_suffix                           : False
# log2stderr                              : False
# loglevel                                : 1
# pattern                                 : NNNNNNNN
# pattern2                                : None
# prime3                                  : None
# quality_encoding                        : None
# quality_filter_mask                     : None
# quality_filter_threshold                : None
# random_seed                             : None
# read2_in                                : None
# read2_out                               : False
# read2_stdout                            : False
# reads_subset                            : None
# reconcile                               : False
# retain_umi                              : None
# short_help                              : None
# stderr                                  : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
# stdin                                   : <_io.TextIOWrapper name='trim5_bc1001--bc1028.fastq.gz' encoding='ascii'>
# stdlog                                  : <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
# stdout                                  : <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
# timeit_file                             : None
# timeit_header                           : None
# timeit_name                             : all
# tmpdir                                  : None
# umi_correct_log                         : None
# umi_whitelist                           : None
# umi_whitelist_paired                    : None
# whitelist                               : None
## 2022-08-15 17:49:07,920 INFO Starting barcode extraction
@m54328U_220703_220458/294/ccs_CGACAGTT

Thanks and regards,

IanSudbery commented 2 years ago

There are several ways you can deal with separating the log information from the output.

Both the output and the log information can be redirected (separately) to files. Use -S/--stdout to redirect the output and -L/--logfile to redirect the logging information. So you can either redirect the output to a file and leave the log information where it is:

umi_tools extract --bc-pattern=NNNNNNNN -I trim5_bc1001--bc1028.fastq.gz -S trim5_bc1001--bc1028.processed.fastq.gz

Or redirect the log and leave the output where it is:

umi_tools extract --bc-pattern=NNNNNNNN -I trim5_bc1001--bc1028.fastq.gz -L trim5_bc1001--bc1028.log

Or redirect both:

umi_tools extract --bc-pattern=NNNNNNNN -I trim5_bc1001--bc1028.fastq.gz -S trim5_bc1001--bc1028.processed.fastq.gz -L trim5_bc1001--bc1028.log

Personally, I like to redirect both.

Finally, it is possible to suppress the log entirely by setting the verbosity to 0 with -v 0.
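
For a FASTQ file that was already written with the log lines mixed in, re-running with the options above is the cleanest fix. If re-running is not practical, a small post-filter might be enough. The sketch below is not part of UMI-tools: `strip_umi_log` is a hypothetical helper name, and it assumes strict 4-line FASTQ records with log lines appearing only between records, never inside one. It keys on record position rather than on the leading `#` alone, because a quality string can legitimately start with `#`.

```shell
#!/bin/sh
# Hypothetical helper (not part of UMI-tools): drop log lines that were
# written into a FASTQ stream. Assumes strict 4-line FASTQ records and that
# log lines only appear between records, never inside one.
strip_umi_log() {
  awk '
    # n counts lines already seen in the current record (0 = expecting a
    # header). A line starting with "#" at a record boundary cannot be a
    # FASTQ header (those start with "@"), so it is skipped as a log line.
    n == 0 && /^#/ { next }
    # Everything else is record content. Quality strings that happen to
    # start with "#" are kept because they arrive when n is 3.
    { print; n = (n + 1) % 4 }
  '
}
```

With the function defined, usage could look like `strip_umi_log < mixed.fastq > clean.fastq`, where both file names are placeholders.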

hukai916 commented 2 years ago

Thanks for the quick reply @IanSudbery, the solution works beautifully!