HCGB-IGTP / XICRA

Small RNAseq pipeline for paired-end reads
MIT License

TypeError: can only concatenate str (not "tuple") to str #46

Closed by Cesnl 10 months ago

Cesnl commented 1 year ago

When I run the example, I always get an error. I don't know how to deal with it.

XICRA prep --input reads/ --output_folder test_XICRA

######################################################################
#                           XICRA pipeline                           #
#                   Jose F. Sanchez & Lauro Sumoy                    #
#        Copyright (C) 2019-2021 Lauro Sumoy Lab, IGTP, Spain        #
######################################################################

|==================================================|
|                Preparing samples                 |
|==================================================|

--------- Starting Process ---------
    10/14/2023, 18:43:57


JFsanchezherrero commented 1 year ago

Good morning, I am not sure what is going on here.

Could you rerun the same command with the --debug flag?

XICRA prep --input reads/ --output_folder test_XICRA --debug

Let me know the output and any errors it produces.

Best regards

Cesnl commented 1 year ago

NOTE: I edited this message to improve the formatting.

/home/XICRA/BMC_bioinformatics_paper/simulation/example# XICRA prep -i reads/ -o test_XICRA/ --debug

######################################################################
#                           XICRA pipeline                           #
#                   Jose F. Sanchez & Lauro Sumoy                    #
#        Copyright (C) 2019-2021 Lauro Sumoy Lab, IGTP, Spain        #
######################################################################

|==================================================|
|                Preparing samples                 |
|==================================================|

--------- Starting Process ---------
    10/19/2023, 19:53:29

+ Create output folder(s):
+ Generate a directory containing information within the project folder provided

--------------------------------------------------
+ Getting files from input folder... 
+ Mode: fastq.
+ Extension: 
[ fastq, fq, fastq.gz, fq.gz ]

+ Input folder exists
** DEBUG: sampleParser.get_files files
{'/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz'}

**DEBUG: sampleParser.get_files files list to check **
DO NOT PRINT THIS LIST: It could be very large...
{'/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz'} 

** DEBUG: select_samples
non_duplicate_names:
['.*']
** DEBUG: select_samples
non_duplicate_names:
{'.*'}
samples_prefix
{'.*'}
non_duplicate_samples
['/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz', '/home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz']
tmp dataframe
  sample  \
0     .*   
1     .*   
2     .*   
3     .*   
4     .*   
5     .*   
6     .*   
7     .*   
8     .*   
9     .*   

                                                                                                                  file  
0  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz  
1  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz  
2  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz  
3  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz  
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz  
5  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz  
6  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz  
7  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz  
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz  
9  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz  
  sample  \
0     .*   
1     .*   
2     .*   
3     .*   
4     .*   
5     .*   
6     .*   
7     .*   
8     .*   
9     .*   

                                                                                                                  file  
0  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz  
1  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz  
2  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz  
3  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz  
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz  
5  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz  
6  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz  
7  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz  
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz  
9  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz  
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz
*** DEBUG: Name: rep_5
*** DEBUG: Lane: 
*** DEBUG: Pair: R1
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz
*** DEBUG: Name: rep_3
*** DEBUG: Lane: 
*** DEBUG: Pair: R1
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz
*** DEBUG: Name: rep_2
*** DEBUG: Lane: 
*** DEBUG: Pair: R1
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz
*** DEBUG: Name: rep_4
*** DEBUG: Lane: 
*** DEBUG: Pair: R1
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz
*** DEBUG: Name: rep_1
*** DEBUG: Lane: 
*** DEBUG: Pair: R1
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz
*** DEBUG: Name: rep_4
*** DEBUG: Lane: 
*** DEBUG: Pair: R2
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz
*** DEBUG: Name: rep_2
*** DEBUG: Lane: 
*** DEBUG: Pair: R2
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz
*** DEBUG: Name: rep_5
*** DEBUG: Lane: 
*** DEBUG: Pair: R2
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz
*** DEBUG: Name: rep_1
*** DEBUG: Lane: 
*** DEBUG: Pair: R2
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
*** DEBUG: Search for sample names: 
*** DEBUG: File: /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz
*** DEBUG: Name: rep_3
*** DEBUG: Lane: 
*** DEBUG: Pair: R2
*** DEBUG: Lane file: 
*** DEBUG: Ext: fq
*** DEBUG: .gz
** DEBUG: select_samples
name_frame_samples:
                                                                                                                sample  \
0  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz   
1  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz   
2  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz   
3  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz   
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz   
5  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz   
6  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz   
7  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz   
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz   
9  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz   

                                                                                                dirname  \
0  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
1  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
2  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
3  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
5  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
6  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
7  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
9  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   

    name new_name  name_len lane read_pair lane_file ext   gz    tag  \
0  rep_5    rep_5         5             R1            fq  .gz  reads   
1  rep_3    rep_3         5             R1            fq  .gz  reads   
2  rep_2    rep_2         5             R1            fq  .gz  reads   
3  rep_4    rep_4         5             R1            fq  .gz  reads   
4  rep_1    rep_1         5             R1            fq  .gz  reads   
5  rep_4    rep_4         5             R2            fq  .gz  reads   
6  rep_2    rep_2         5             R2            fq  .gz  reads   
7  rep_5    rep_5         5             R2            fq  .gz  reads   
8  rep_1    rep_1         5             R2            fq  .gz  reads   
9  rep_3    rep_3         5             R2            fq  .gz  reads   

             file  
0  rep_5_R1.fq.gz  
1  rep_3_R1.fq.gz  
2  rep_2_R1.fq.gz  
3  rep_4_R1.fq.gz  
4  rep_1_R1.fq.gz  
5  rep_4_R2.fq.gz  
6  rep_2_R2.fq.gz  
7  rep_5_R2.fq.gz  
8  rep_1_R2.fq.gz  
9  rep_3_R2.fq.gz  
number_files:
10
total_samples:
{'rep_4', 'rep_5', 'rep_3', 'rep_1', 'rep_2'}
    10 files selected...
    5 samples selected...
    Paired-end mode selected...
** DEBUG: pd_samples_retrieved
                                                                                                                sample  \
0  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R1.fq.gz   
1  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R1.fq.gz   
2  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R1.fq.gz   
3  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R1.fq.gz   
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz   
5  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_4_R2.fq.gz   
6  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_2_R2.fq.gz   
7  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_5_R2.fq.gz   
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz   
9  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_3_R2.fq.gz   

                                                                                                dirname  \
0  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
1  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
2  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
3  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
5  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
6  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
7  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
9  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   

    name new_name  name_len lane read_pair lane_file ext   gz    tag  \
0  rep_5    rep_5         5             R1            fq  .gz  reads   
1  rep_3    rep_3         5             R1            fq  .gz  reads   
2  rep_2    rep_2         5             R1            fq  .gz  reads   
3  rep_4    rep_4         5             R1            fq  .gz  reads   
4  rep_1    rep_1         5             R1            fq  .gz  reads   
5  rep_4    rep_4         5             R2            fq  .gz  reads   
6  rep_2    rep_2         5             R2            fq  .gz  reads   
7  rep_5    rep_5         5             R2            fq  .gz  reads   
8  rep_1    rep_1         5             R2            fq  .gz  reads   
9  rep_3    rep_3         5             R2            fq  .gz  reads   

             file  
0  rep_5_R1.fq.gz  
1  rep_3_R1.fq.gz  
2  rep_2_R1.fq.gz  
3  rep_4_R1.fq.gz  
4  rep_1_R1.fq.gz  
5  rep_4_R2.fq.gz  
6  rep_2_R2.fq.gz  
7  rep_5_R2.fq.gz  
8  rep_1_R2.fq.gz  
9  rep_3_R2.fq.gz  
-------------------------
(Time spent: 0 h 0 min 0 s)
-------------------------
+ Dataframe grouby variable 'new_name'
                                                                                                                sample  \
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R1.fq.gz   
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads/rep_1_R2.fq.gz   

                                                                                                dirname  \
4  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   
8  /home/XICRA/BMC_bioinformatics_paper/simulation/example/reads   

    name new_name  name_len lane read_pair lane_file ext   gz    tag  \
4  rep_1    rep_1         5             R1            fq  .gz  reads   
8  rep_1    rep_1         5             R2            fq  .gz  reads   

             file        new_file  
4  rep_1_R1.fq.gz  rep_1_R1.fq.gz  
8  rep_1_R2.fq.gz  rep_1_R2.fq.gz

Traceback (most recent call last):
  File "/root/miniconda3/bin/XICRA", line 396, in <module>
    args.func(args)
  File "/root/miniconda3/lib/python3.9/site-packages/XICRA/modules/prep.py", line 173, in run_prep
    outdir_dict = functions.files_functions.outdir_project(outdir, options.project, pd_samples_retrieved, "raw", options.debug)    
  File "/root/miniconda3/lib/python3.9/site-packages/HCGB/functions/files_functions.py", line 60, in outdir_project
    sample_name_dir = create_subfolder(name, sample_dir)        
  File "/root/miniconda3/lib/python3.9/site-packages/HCGB/functions/files_functions.py", line 91, in create_subfolder
    subfolder_path = path + "/" + name
TypeError: can only concatenate str (not "tuple") to str

Thanks for your reply! The above is the debug output.
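
For readers who hit the same traceback: the failing line concatenates a folder path with a sample name that arrives as a tuple instead of a string. One possible cause, not confirmed in this thread, is that the name comes from iterating over a pandas groupby; newer pandas releases return one-element tuples as keys when grouping by a list containing a single column. The sketch below reproduces the error with plain Python and shows a defensive way to build the subfolder path. The helpers `normalize_group_key` and `create_subfolder_safe` are hypothetical names used for illustration and are not part of HCGB or XICRA.

```python
import os
import tempfile


def normalize_group_key(name):
    """Return a plain string for a key that may arrive as a one-element tuple."""
    if isinstance(name, tuple) and len(name) == 1:
        name = name[0]
    return str(name)


def create_subfolder_safe(name, path):
    """Hypothetical defensive variant; NOT the actual HCGB implementation."""
    subfolder_path = os.path.join(path, normalize_group_key(name))
    os.makedirs(subfolder_path, exist_ok=True)
    return subfolder_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Reproduce the failure mode seen in the traceback.
        try:
            tmp + "/" + ("rep_1",)
        except TypeError as err:
            print("Reproduced:", err)
        # A tuple key and a plain string both resolve to the same folder.
        print(create_subfolder_safe(("rep_1",), tmp))
        print(create_subfolder_safe("rep_1", tmp))
```

This only illustrates the failure mode; the fix suggested by the maintainer in this thread is to reinstall from the tested conda environment.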

JFsanchezherrero commented 1 year ago

Hi there,

I am afraid it could be due to a different Python version: the paths in your traceback point to Python 3.9.

How did you install the software?

Did you follow the guidelines specified in the README file? We tested with Python 3.7.11 and exported the conda environment configuration to the environment.yml file. Installing from there should give you a working installation.

Let me know if you can fix it by doing this.

Try:

## get configuration file
wget https://raw.githubusercontent.com/HCGB-IGTP/XICRA/master/XICRA_pip/devel/conda/environment.yml
conda env create -f environment.yml

## activate conda environment
conda activate XICRA

## install latest XICRA code
pip install XICRA

## install missing dependencies
wget https://raw.githubusercontent.com/HCGB-IGTP/XICRA/master/XICRA_pip/XICRA/config/software/installer.sh
sh installer.sh

## test XICRA
cd subset2test/
sh test_subset.sh

Bear in mind that you might need to remove the previously generated conda environment or adjust the paths accordingly.

Regards
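
To compare an existing installation against the tested configuration mentioned above (Python 3.7.11), a small check like the following may help before reinstalling. It is only a sketch: the distribution names "XICRA", "HCGB" and "pandas" are assumptions based on the traceback and the pip command in this thread, and the functions `installed_version` and `environment_report` are hypothetical helpers, not part of XICRA.

```python
import sys

try:
    from importlib import metadata as _metadata  # available from Python 3.8
except ImportError:
    _metadata = None  # fall back to pkg_resources on older interpreters


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    if _metadata is not None:
        try:
            return _metadata.version(dist_name)
        except _metadata.PackageNotFoundError:
            return None
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


def environment_report(dist_names=("XICRA", "HCGB", "pandas")):
    """Collect the Python version and the versions of the given distributions."""
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value or 'not installed'}")
```

Running it inside the activated conda environment and pasting the output into the issue makes version mismatches easier to spot.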