Closed lpantano closed 8 months ago
Hi,
You can also try the new version of the report. Download it from https://github.com/lpantano/seqcluster/blob/master/seqcluster/templates/report.rmd and modify root_path to try it.
Yes, I modified the root_path, so the script picks up the file. Thanks, this new report looks nice to me. Actually I have 12 samples and I need to do 7 pairwise comparisons (differential expression). I ran 2 samples just to familiarize myself with the pipeline. I will run the remaining 10 samples and use the exploratory analysis you provided. After that I will return to the problem of report generation, if you don't mind. Thank you so much for your time and support of this analysis!
Nice. OK, when you have more samples let me know. I am interested in getting that working in your case, because it should work nicely with this number of samples. Thanks for the help.
The new version I updated today should work, I think. At least for that point.
If you add the metadata to the YAML file, then you will have all the information in this report.
Hi again! Now I have all the samples and I'm working with the report https://github.com/lpantano/seqcluster/blob/master/seqcluster/templates/report.rmd
First, I've changed
#metadata = read.csv(metadata_fn, row.names="sample_id")
metadata = read.csv(metadata_fn)
metadata = metadata[,1] #sample_ids
to get proper sample ids.
Exploratory analysis, size distribution: there is no trimming_stats file; the trimming stats are in bcbio-nextgen-debug.log. The correct files are:
files = list.files(file.path(root_path),pattern = "trimming.fastq_size_stats",recursive = T)
Hi,
I think you had an older version when you ran that. You can change that part of the code, or you can remove the final folder and restart so that all the files are created again with the names that are expected now. In the final folder the trimming stats should look like:
ls repos/bcbio-nextgen/tests/srna_test/upload/miRQCa/
miRQCa-mirbase-ready.counts miRQCa-ready.trimming_stats qc tdrmapper
For the first part: metadata should be a data.frame whose row.names are the sample_id column and whose columns are the remaining columns. I don't think what you have now will work for the whole report.
Can you paste here what you get if you run the actual code:
metadata = read.csv(metadata_fn, row.names="sample_id")
condition = names(metadata)[1]
metadata
It should be a data.frame. Thanks for the help.
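For readers working outside R, the expected shape can be sketched in Python: a table keyed by sample_id, with the first remaining column taken as the condition. This is only an illustration; the `load_metadata` helper and the sample names are made up, and the report itself uses `read.csv(metadata_fn, row.names="sample_id")`.

```python
import csv
import io


def load_metadata(handle):
    """Return ({sample_id: {column: value}}, condition_column) from a summary CSV."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    if "sample_id" not in fieldnames:
        raise ValueError("summary.csv needs a 'sample_id' column")
    # Every column except sample_id becomes a metadata column.
    columns = [name for name in fieldnames if name != "sample_id"]
    if not columns:
        raise ValueError("summary.csv needs at least one metadata column")
    table = {}
    for row in reader:
        table[row["sample_id"]] = {name: row[name] for name in columns}
    # Mirrors `condition = names(metadata)[1]` in the report (R is 1-indexed).
    condition = columns[0]
    return table, condition


# Made-up example content, not the real summary.csv from this thread.
example = io.StringIO("sample_id,group\nsampleA,fake\nsampleB,fake\n")
metadata, condition = load_metadata(example)
print(condition, metadata)
```

Keeping only the first column (as in `metadata[,1]`) would throw away exactly the columns that the rest of the report reads.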
Ok, I'll update and restart. About the dataframe
group
HI_3550_004_RPI1_R4215_R1 fake
HI_3550_004_RPI3_R4217_R1 fake
HI_3550_004_RPI5_R4219_R1 fake
HI_3550_005_RPI41_R4228_R1 fake
HI_3550_005_RPI43_R4230_R1 fake
HI_3550_005_RPI7_R4226_R1 fake
HI_3550_004_RPI2_R4216_R1 fake
HI_3550_004_RPI4_R4218_R1 fake
HI_3550_004_RPI6_R4220_R1 fake
HI_3550_005_RPI42_R4229_R1 fake
HI_3550_005_RPI44_R4231_R1 fake
HI_3550_005_RPI8_R4227_R1 fake
It is a data.frame, but the first column has no "sample_id" header, and that causes problems below.
Can you tell me the exact line where you hit the problem with that?
If it is in the adapter plots, it could be because no files match that pattern. I am not using sample_id in any other part of the code, so I am curious which line fails because of that.
thanks
Line 38. It just reads summary.csv, the list of samples:
metadata_fn = list.files(file.path(root_path), pattern = "summary.csv$",recursive = T, full.names = T)
metadata = read.csv(metadata_fn, row.names="sample_id")
condition = names(metadata)[1]
design = metadata
formula = ~ condition # modify this to get your own formula, it should be a column in your metadata
isde=FALSE # set this to TRUE to run the DE analysis
Yes, line 38 is where you load the summary file.
I want to know where in the code you hit a problem when you load that file as it is in the report, because you mentioned that it produces errors below if you don't change it to:
#metadata = read.csv(metadata_fn, row.names="sample_id")
metadata = read.csv(metadata_fn)
metadata = metadata[,1] #sample_ids
So I want to know where "below" is (where the issue appears) if you don't change that part of the code.
Yes, you are right, the errors are caused by missing files; these lines are OK. So I'm updating
bcbio-nextgen: 0.9.8a0-py27_5 --> 0.9.8a0-py27_7
to generate files. Thanks!
let me know what happens! good luck!
It seems I have some problems with metadata
[2016-05-19T20:55Z] summarize variants
[2016-05-19T20:55Z] Timing: report
Traceback (most recent call last):
File "/home/naumenko/work/tools/bin/bcbio_nextgen.py", line 226, in <module>
main(**kwargs)
File "/home/naumenko/work/tools/bin/bcbio_nextgen.py", line 43, in main
run_main(**kwargs)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 43, in run_main
fc_dir, run_info_yaml)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 87, in _run_toplevel
for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 330, in smallrnaseqpipeline
srna_report(samples)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/srna/group.py", line 136, in report
group = _guess_group(info)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/srna/group.py", line 156, in _guess_group
return ",".join(info["metadata"].values())
TypeError: sequence item 1: expected string, NoneType found
Could you explain metadata usage in microRNA analysis? In a cancer pipeline I would use
#sample1
metadata:
  batch: batch1
  phenotype: tumor
#sample2
metadata:
  batch: batch1
  phenotype: normal
Would the same approach work for miRNA?
#sample1
metadata:
  batch: batch1
  phenotype: experiment
#sample2
metadata:
  batch: batch1
  phenotype: control
#sample3
metadata:
  batch: batch2
  phenotype: experiment
#sample2
metadata:
  batch: batch2
  phenotype: control
to compare sample1 vs sample2, sample3 vs sample2?
Hi,
It is weird; it is supposed to work like that. I will try to solve it tomorrow. Could it be that some of the samples in your YAML file have an empty value for some of the metadata keys?
I will try to reproduce.
thanks
Hi,
I tried to reproduce it, but the only way I could was with lines like this in the metadata:
metadata:
  batch:
  phenotype: something
That is, an empty value. Can you check that? I will fix it anyway just in case, but it is weird that you are getting a None value in any of the metadata.
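The failing call in the traceback, `",".join(info["metadata"].values())`, needs every value to be a string, and a YAML key with nothing after the colon is loaded as None. The sketch below reproduces that with a plain dict and shows one defensive way to build the label. `guess_group_safe` is a hypothetical helper for illustration, not bcbio's `_guess_group` and not the fix that went into bcbio; the "NA" placeholder is also an assumption.

```python
def guess_group_safe(metadata):
    """Join metadata values into one label, tolerating empty (None) values."""
    parts = []
    for key in sorted(metadata):
        value = metadata[key]
        if value is None:
            # An empty YAML entry such as `batch:` is loaded as None.
            parts.append("NA")
        elif isinstance(value, (list, tuple)):
            # Several values under one key cannot become a single column value.
            raise ValueError("metadata key %r has several values: %r" % (key, value))
        else:
            parts.append(str(value))
    return ",".join(parts)


broken = {"batch": None, "phenotype": "something"}
try:
    ",".join(broken.values())  # same call shape as in the traceback
except TypeError as error:
    print("plain join fails:", error)

print(guess_group_safe(broken))
```

Sorting the keys keeps the label stable across samples regardless of the order in which the keys were written.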
Hi, in the config I had:
cat srna12.yaml | awk '{if($0~ "metadata"){print $0;getline;print $0;}}'
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: potato
metadata:
experiment: bean
metadata:
experiment: bean
metadata:
experiment: bean
All metadata keys have a value, and the YAML is valid. I suspect that I was running the previous version of the pipeline and the report was generated by the new one. I will try the batch-phenotype scheme. Also, for some comparisons I'm using the potato reference and for others the bean reference.
Now I have a problem with config file generation for multiple samples. My potato.template.yaml is
---
details:
  - algorithm:
      aligner: star
      adapters: ["TGGAATTCTCGGGTGC"]
      species: stu
      tools_off: ["seqcluster"]
    analysis: smallRNA-seq
    description: R4215
    files:
      - /home/naumenko/work/mirna/input/HI.3550.004.RPI1.R4215_R1.fastq.gz
    genome_build: soltub3
    metadata:
      batch: batch1
      phenotype: experiment
fc_date: '2016-05-10'
fc_name: srna1
upload:
  dir: ../final
my sample list srna12.csv is
samplename,description,batch,phenotype,sex,variant_regions
HI.3550.004.RPI3.R4217,R4217,batch1,experiment,,
HI.3550.004.RPI1.R4215,R4215,batch1,control,,
HI.3550.004.RPI4.R4218,R4218,batch2,experiment,,
HI.3550.004.RPI2.R4216,R4216,batch2,control,,
HI.3550.004.RPI6.R4220,R4220,batch3,experiment,,
HI.3550.004.RPI5.R4219,R4219,batch3,control,,
HI.3550.005.RPI41.R4228,R4228,batch4,experiment,,
HI.3550.005.RPI7.R4226,R4226,batch4;batch5,control,,
HI.3550.005.RPI43.R4230,R4230,batch5,experiment,,
HI.3550.005.RPI42.R4229,R4229,batch6,experiment,,
HI.3550.005.RPI8.R4227,R4227,batch6;batch7,control,,
HI.3550.005.RPI44.R4231,R4231,batch7,experiment,,
and I'm running
#!/bin/bash
KPATH=/home/naumenko/work/mirna/input
bcbio_nextgen.py -w template potato.template.yaml srna12.csv \
$KPATH/HI.3550.004.RPI3.R4217_R1.fastq.gz \
$KPATH/HI.3550.004.RPI1.R4215_R1.fastq.gz \
$KPATH/HI.3550.004.RPI4.R4218_R1.fastq.gz \
$KPATH/HI.3550.004.RPI2.R4216_R1.fastq.gz \
$KPATH/HI.3550.004.RPI6.R4220_R1.fastq.gz \
$KPATH/HI.3550.004.RPI5.R4219_R1.fastq.gz \
$KPATH/HI.3550.005.RPI41.R4228_R1.fastq.gz \
$KPATH/HI.3550.005.RPI7.R4226_R1.fastq.gz \
$KPATH/HI.3550.005.RPI43.R4230_R1.fastq.gz \
$KPATH/HI.3550.005.RPI42.R4229_R1.fastq.gz \
$KPATH/HI.3550.005.RPI8.R4227_R1.fastq.gz \
$KPATH/HI.3550.005.RPI44.R4231_R1.fastq.gz
Warnings
WARNING: Added minimal sample information: metadata not found for HI.3550.004.RPI3.R4217_R1, HI.3550.004.RPI3.R4217_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.004.RPI1.R4215_R1, HI.3550.004.RPI1.R4215_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.004.RPI4.R4218_R1, HI.3550.004.RPI4.R4218_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.004.RPI2.R4216_R1, HI.3550.004.RPI2.R4216_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.004.RPI6.R4220_R1, HI.3550.004.RPI6.R4220_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.004.RPI5.R4219_R1, HI.3550.004.RPI5.R4219_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.005.RPI41.R4228_R1, HI.3550.005.RPI41.R4228_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.005.RPI7.R4226_R1, HI.3550.005.RPI7.R4226_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.005.RPI43.R4230_R1, HI.3550.005.RPI43.R4230_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.005.RPI42.R4229_R1, HI.3550.005.RPI42.R4229_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.005.RPI8.R4227_R1, HI.3550.005.RPI8.R4227_R1.fastq.gz
WARNING: Added minimal sample information: metadata not found for HI.3550.005.RPI44.R4231_R1, HI.3550.005.RPI44.R4231_R1.fastq.gz
It creates a config file with batch: batch1 and phenotype: experiment for all samples, instead of batch1 to batch7 and experiment/control.
You probably need to add _R1 to the samplename column. Sometimes that fixes the problem. We try to detect as much as possible, but sometimes we miss this kind of difference.
Let me know.
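A quick way to spot this kind of mismatch before generating the config is to compare the samplename column with the names derived from the FASTQ files. The sketch below assumes strict, exact matching after removing the FASTQ extensions; bcbio's real matching is more lenient, and both helper functions are hypothetical.

```python
import os


def strip_fastq_ext(path):
    """Return the file name without its directory and FASTQ extensions."""
    name = os.path.basename(path)
    for ext in (".gz", ".fastq", ".fq"):
        if name.endswith(ext):
            name = name[: -len(ext)]
    return name


def unmatched_samples(csv_names, fastq_paths):
    """List file-derived names that have no identical samplename in the CSV."""
    known = set(csv_names)
    return [n for n in map(strip_fastq_ext, fastq_paths) if n not in known]


files = ["/home/naumenko/work/mirna/input/HI.3550.004.RPI3.R4217_R1.fastq.gz"]

# samplename without the _R1 suffix, as in the srna12.csv above
print(unmatched_samples(["HI.3550.004.RPI3.R4217"], files))
# samplename with the _R1 suffix added
print(unmatched_samples(["HI.3550.004.RPI3.R4217_R1"], files))
```

Any name reported by `unmatched_samples` is a candidate for the "metadata not found" warning.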
Thanks, it works that way! It is not a typical dataset: usually I have two reads in a pair, _R1 and _R2. Here there are just _R1 files. SN
Hi, in the end it crashes after multiqc:
[2016-05-25T15:43Z] [INFO ] multiqc : MultiQC complete
[2016-05-25T15:43Z] Timing: report
Traceback (most recent call last):
File "/home/naumenko/work/tools/bin/bcbio_nextgen.py", line 226, in <module>
main(**kwargs)
File "/home/naumenko/work/tools/bin/bcbio_nextgen.py", line 43, in main
run_main(**kwargs)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 43, in run_main
fc_dir, run_info_yaml)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 87, in _run_toplevel
for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 330, in smallrnaseqpipeline
srna_report(samples)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/srna/group.py", line 136, in report
group = _guess_group(info)
File "/home/naumenko/work/tools/bcbio/anaconda/lib/python2.7/site-packages/bcbio/srna/group.py", line 156, in _guess_group
return ",".join(info["metadata"].values())
TypeError: sequence item 0: expected string, list found
Some of my samples participate in two comparisons:
- algorithm:
  metadata:
    batch:
      - batch4
      - batch5
Oops, sorry. I don't support lists there. The idea is to convert the values in metadata to columns, with one value per key. Any reason you would need two values in batch?
I'd like to compare sample1 vs sample 2, sample 3 vs sample 2.
Well, in that case I would use something like
metadata:
  comparison1: group1
  comparison2: group1
...
metadata:
  comparison1: group2
  comparison2: none
and change those values to whatever you want to call the groups. But for now, avoid multiple values for a variable inside the metadata section.
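The one-column-per-comparison idea can be sketched as follows: starting from a list of pairwise comparisons, every sample gets exactly one value for every comparison key. The `comparison_metadata` helper and the (case, control) convention are assumptions made for illustration; the labels group1, group2 and none follow the example above.

```python
def comparison_metadata(comparisons):
    """Build {sample: {comparison_name: group}} with one value per key."""
    samples = sorted({s for pair in comparisons.values() for s in pair})
    table = {s: {} for s in samples}
    for name, (case, control) in sorted(comparisons.items()):
        for sample in samples:
            if sample == case:
                table[sample][name] = "group1"
            elif sample == control:
                table[sample][name] = "group2"
            else:
                # The sample does not take part in this comparison.
                table[sample][name] = "none"
    return table


# sample2 is the control in both comparisons, yet each key holds one value.
comparisons = {
    "comparison1": ("sample1", "sample2"),
    "comparison2": ("sample3", "sample2"),
}
for sample, values in comparison_metadata(comparisons).items():
    print(sample, values)
```

A sample shared between comparisons then no longer needs a list under a single key.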
As a heads-up, I hope you have replicates for the differential expression analysis, because DESeq2 won't work otherwise. And in your case you will probably need to modify the code, because you want to do more than one comparison.
cheers
Sorry, one more question about the report. In counts_mirna.tsv I have about 7 million miRNA counts for sample1:
cat counts_mirna.tsv | sed 1d | awk '{sum+=$2}END{print sum}'
7219907
I see the same total in sample_folder/R4215-mirbase-ready.counts:
cat R4215-mirbase-ready.counts | sed 1d | awk '{sum+=$3}END{print sum}'
7219907
However, the report gives 639,411:
obj <- IsomirDataSeqFromFiles(files, design = design, header = T, skip=0)
> sum((counts(obj))[,2])
[1] 639411
What is wrong?
Thanks, SN
Yeah, that is weird. Can you get the sums for all the columns in counts(obj), then load counts_mirna.tsv, get the same sums, and compare the numbers for each sample?
You can do that with the colSums command.
I am trying to reproduce it, but I get the same numbers, so we'll need to dig further into this.
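The same check can be sketched outside R. The example below sums every sample column of a counts table and compares two tables by sample name rather than by column position. It assumes a tab-separated file with a header row, feature ids in the first column and integer counts; the tiny tables and both helper names are made up for illustration.

```python
import csv
import io


def column_sums(handle, delimiter="\t"):
    """Sum every sample column of a counts table; column 1 holds feature ids."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    sums = {name: 0 for name in header[1:]}
    for row in reader:
        for name, value in zip(header[1:], row[1:]):
            sums[name] += int(value)
    return sums


def differing_samples(sums_a, sums_b):
    """Compare by sample name, never by column position."""
    names = sorted(set(sums_a) | set(sums_b))
    return [n for n in names if sums_a.get(n) != sums_b.get(n)]


# Made-up tables: same samples, different column order, one real difference.
table_a = io.StringIO("mirna\tS1\tS2\nmir-1\t5\t7\nmir-2\t1\t3\n")
table_b = io.StringIO("mirna\tS2\tS1\nmir-1\t7\t5\nmir-2\t3\t2\n")
sums_a = column_sums(table_a)
sums_b = column_sums(table_b)
print(sums_a, sums_b, differing_samples(sums_a, sums_b))
```

Comparing by name separates a true difference in counts from a mere difference in column order.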
sorry.
Yes, something is wrong with the order, not with the numbers. Maybe I should sort the sample names in summary.csv:
> colSums(counts_from_file)
R4215 R4216 R4217 R4218 R4219 R4220 R4226 R4228 R4230
7219907 639411 4992095 1088285 7347927 11789081 2776391 7415751 11772770
> colSums(counts(obj))
R4217 R4215 R4218 R4216 R4220 R4219 R4228 R4226 R4230 R4229 R4227
7219907 639411 4992095 1088285 7347927 11789081 2776391 13565180 7415751 4508983 11772770
R4231
10224417
Sorry, it seems that it is my fault: I've mixed files from two pipeline runs, one with 9 samples and one with 12.
Same set, sorting matters:
> colSums(counts(obj))
R4217 R4215 R4218 R4216 R4220 R4219 R4228 R4226 R4230
7219907 639411 4992095 1088285 7347927 11789081 2776391 13565180 7415751
> colSums(counts_from_file)
R4215 R4216 R4217 R4218 R4219 R4220 R4226 R4228 R4230
7219907 639411 4992095 1088285 7347927 11789081 2776391 7415751 11772770
Yeah, the problem is the naming.
So in this command:
obj <- IsomirDataSeqFromFiles(files = files[rownames(design)], design = design , header = T, skip=0, quiet = FALSE)
files is indexed by rownames(design) so that it comes out in the same order as design. The bottom line is that the files vector should be in the same order as the row.names of the design matrix. Is that the problem?
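The effect of `files[rownames(design)]` can be sketched with a name-to-path mapping: selecting by sample name both reorders the files to match the design and drops any extra files found on disk. `order_files_by_design` is a hypothetical helper, and the paths and the R9999 entry are made up.

```python
def order_files_by_design(files, design_rows):
    """Return file paths in design order, failing loudly on missing samples."""
    missing = [s for s in design_rows if s not in files]
    if missing:
        raise KeyError("no counts file for: %s" % ", ".join(missing))
    # Selecting by name reorders the files and ignores extras in `files`.
    return [files[s] for s in design_rows]


files = {
    "R4215": "R4215/R4215-mirbase-ready.counts",
    "R4217": "R4217/R4217-mirbase-ready.counts",
    "R9999": "R9999/R9999-mirbase-ready.counts",  # leftover from another run
}
design_rows = ["R4217", "R4215"]  # row order of the design table
print(order_files_by_design(files, design_rows))
```

Raising on a missing sample is a design choice here: a silent gap would shift later columns onto the wrong sample names.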
Thanks, finally I have it right. The script searches for all mirbase-ready files, not only those listed in summary.csv, and I had some additional files there. Thanks a lot!
> colSums(counts_from_file)
R4215 R4216 R4217 R4218 R4219 R4220 R4226 R4228 R4230
7219907 639411 4992095 1088285 7347927 11789081 2776391 7415751 11772770
> colSums(counts(obj))
R4215 R4216 R4217 R4218 R4219 R4220 R4226 R4228 R4230
7219907 639411 4992095 1088285 7347927 11789081 2776391 7415751 11772770
Could you please look at the A. thaliana issue? https://github.com/chapmanb/bcbio-nextgen/issues/1416
Nice.
I will modify the genome_setup script in bcbio so that it can add that to an existing genome.
Hi @naumenko-sa,
We can discuss the problem with the report here.
Sadly, that report is a template that won't work for every analysis. Can you tell me what you would like to do with your data? Since you only have 2 samples, you are probably interested in only a couple of figures, because we cannot do a lot with that number.
Some questions:
Does root_path point to the final folder? And what do you get when you run
list.files(file.path(root_path),pattern = "trimming_stats",recursive = T)
inside R?
As I said, you will get little from this report. The most important part is the size distribution, which you can also see by opening the HTML in the multiqc folder. I plan to migrate almost all QC figures there during the summer, so this will be better.
If you give me more information about what you would like to have, I may be able to help.
cheers