Closed Valentin-Bio closed 1 year ago
Hi,
Regarding this log: is the program considering 186 samples or 186 files?
These 186 files will be merged into 93 samples after stage_two. You should see 93 columns in the final output. If not, then it might be a bug.
After running stage_two, I picked the normalized_16S.type.txt file for further analysis. Would it be a good idea to just sum the normalized counts found in each paired-end file for the same sample?
If the files are not merged automatically, you could first sum the unnormalized counts of the _1 and _2 files, then divide that sum by the sum of the 16S values of the _1 and _2 files in metadata.txt.
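The arithmetic behind this suggestion can be sketched in a few lines of Python. This is only an illustration: the function name and all numbers are hypothetical, and reading the actual count tables and metadata.txt is left out because their exact layout is not shown in this thread.

```python
def merged_normalized_count(count_1, count_2, n16s_1, n16s_2):
    """Sum the raw counts of both mates, then divide by the summed 16S values."""
    total_16s = n16s_1 + n16s_2
    if total_16s == 0:
        raise ValueError("summed 16S value is zero; cannot normalize")
    return (count_1 + count_2) / total_16s


# Hypothetical values for one ARG type in one sample (not real data).
merged = merged_normalized_count(count_1=12.0, count_2=8.0, n16s_1=40.0, n16s_2=60.0)

# For comparison: adding the two already-normalized values gives a different number.
naive = 12.0 / 40.0 + 8.0 / 60.0

print(merged)
print(naive)
```

With these made-up numbers the two approaches disagree, which is why the raw counts are summed before dividing rather than adding the per-file normalized values.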
HTH, Xi
Thanks, the problem was that some of the file names had an extra underscore before the last underscore.
e.g.
readA_1_1.fastq.gz readA_2_2.fastq.gz
Thanks! I renamed the files and everything is good now.
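For anyone hitting the same issue, a quick pre-flight check on the file names can be sketched in Python. This is not args_oap's actual parsing logic, only an assumed rule (sample name, then _1 or _2, then the format given with -f); the helper names group_mates and unpaired are made up for this example.

```python
import re
from collections import defaultdict


def group_mates(filenames, fmt="fastq.gz"):
    """Group file names by sample, assuming the pattern <sample>_<1|2>.<fmt>."""
    pattern = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\." + re.escape(fmt) + r"$")
    groups = defaultdict(set)
    for name in filenames:
        match = pattern.match(name)
        if match:
            groups[match.group("sample")].add(match.group("mate"))
        else:
            # No _1/_2 suffix: keep the file under its own name as a single-end file.
            groups[name].add("single")
    return dict(groups)


def unpaired(groups):
    """Return the sample names that do not have both a _1 and a _2 file."""
    return sorted(s for s, mates in groups.items() if mates != {"1", "2"})


# File names from the example above, before and after renaming.
before = group_mates(["readA_1_1.fastq.gz", "readA_2_2.fastq.gz"])
after = group_mates(["readA_1.fastq.gz", "readA_2.fastq.gz"])

print(unpaired(before))
print(unpaired(after))
```

Under this assumed rule, the original names resolve to two different sample names, each with only one mate, while the renamed files resolve to a single sample with both mates.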
Hello, I have 93 metagenomic samples. I first ran the stage_one program as follows:
args_oap stage_one -I all_guys -o args -f fastq.gz -t 20
According to the "(optional) Single/Paired end files" section on the main GitHub page, for args_oap to treat data as paired-end, the sample file names must end with _1 or _2 followed by the specified format (-f).
My paired-end reads have the _1.fastq.gz and _2.fastq.gz suffixes. Since I specified -f fastq.gz, these files should be treated as two files for one sample (paired-end data), but according to the stdout message, the program is treating them as separate samples:
First question:
Regarding this log: is the program considering 186 samples or 186 files?
Second question:
After running stage_two, I picked the normalized_16S.type.txt file for further analysis. Would it be a good idea to just sum the normalized counts found in each paired-end file for the same sample?
e.g.
CL_FP.BAC4J_CATAATAC-CGTTAGAA_L00M_1.fastq.gz counts + CL_FP.BAC4J_CATAATAC-CGTTAGAA_L00M_2.fastq.gz counts
Thanks for your time :)
bests,
Valentín.