Closed xiekunwhy closed 2 years ago
There are 3 approaches.
Assemble all samples together (as if all reads were from a single sample)
In this case, you don't need to merge any FASTQ files. Make sure you specify the files in the same order for -left and -right. For example, if you have two samples, sample1 and sample2:
java -jar RNA-Bloom.jar -left sample1_1.fq.gz sample2_1.fq.gz -right sample1_2.fq.gz sample2_2.fq.gz -revcomp-right ...
Pooled assembly of your samples with the -pool and -mergepool options.
Each sample is assembled using the pooled de Bruijn graph and all assemblies are merged together.
java -jar RNA-Bloom.jar -pool READSLIST.txt -mergepool
Please refer to the README here: https://github.com/bcgsc/RNA-Bloom/tree/v1.4.3#b-assemble-single-cell-rna-seq-data-with-pooled-assembly-mode
PS. It is very important to note that the input file format for version 1.4.3 is different from the one described on the master branch, which is for an upcoming version: https://github.com/bcgsc/RNA-Bloom#b-assemble-multi-sample-rna-seq-data-with-pooled-assembly-mode
Assemble each sample separately and merge the assemblies with BBMap's dedupe: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/dedupe-guide/
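For the third approach, a minimal sketch of the merge step, printed as a dry run rather than executed. It assumes BBTools is installed with dedupe.sh on the PATH, and that dedupe.sh accepts a comma-separated list for in=, as the Dedupe guide describes. The helper build_dedupe_cmd and the file names are made up for illustration.

```shell
# Hypothetical helper: print a dedupe.sh command that merges several
# per-sample assemblies into one FASTA file. Nothing is executed.
# Usage: build_dedupe_cmd OUTPUT.fa ASSEMBLY1.fa ASSEMBLY2.fa ...
build_dedupe_cmd() {
    out="$1"
    shift
    inputs=""
    for f in "$@"; do
        # Join the input files with commas for the in= parameter.
        if [ -z "$inputs" ]; then
            inputs="$f"
        else
            inputs="$inputs,$f"
        fi
    done
    echo "dedupe.sh in=$inputs out=$out"
}

# Example with placeholder file names (not real RNA-Bloom output names):
build_dedupe_cmd merged.fa sample1.transcripts.fa sample2.transcripts.fa
```

Check the printed command against the Dedupe guide before running it, since the defaults may not match what you want for transcripts.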
I recommend the 2nd method if you have a large-memory server and don't have too many samples.
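For the first approach, the main thing to get right is that the -left and -right lists name the samples in the same order. Here is a minimal sketch that builds both lists from sample names and prints the command instead of running it. It assumes the files are named <sample>_1.fq.gz and <sample>_2.fq.gz, as in the example above. build_rnabloom_cmd is a hypothetical helper, not part of RNA-Bloom.

```shell
# Hypothetical helper: print an RNA-Bloom command whose -left and -right
# lists are in the same sample order. Nothing is executed.
# Usage: build_rnabloom_cmd SAMPLE1 SAMPLE2 ...
build_rnabloom_cmd() {
    left=""
    right=""
    for s in "$@"; do
        # Append the two mates of each sample in lockstep.
        left="$left ${s}_1.fq.gz"
        right="$right ${s}_2.fq.gz"
    done
    echo "java -jar RNA-Bloom.jar -left$left -right$right -revcomp-right"
}

build_rnabloom_cmd sample1 sample2
```

Any remaining options (the "..." in the command above) still need to be appended by hand.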
I am trying the pooled assembly.
I have another question about reference-guided assembly: may I use StringTie (or StringTie merge) results as reference transcripts? I have no real reference transcripts since I am working on a de novo genome.
Yes, but the input needs to be a FASTA file.
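StringTie writes GTF, so the transcripts have to be extracted as FASTA first. One possible way is gffread with -w (spliced transcript sequences) and -g (genome FASTA). The sketch below only prints the two commands. It assumes gffread is installed and that RNA-Bloom takes the reference FASTA through a -ref option, so please check both against the README for your version. build_ref_cmds and all file names are hypothetical.

```shell
# Hypothetical helper: print (1) a gffread command that converts a StringTie
# GTF into transcript FASTA and (2) an RNA-Bloom command that uses it.
# Nothing is executed.
# Usage: build_ref_cmds STRINGTIE.gtf GENOME.fa REFERENCE_OUT.fa
build_ref_cmds() {
    gtf="$1"
    genome="$2"
    ref="$3"
    # gffread: -w writes spliced exon sequences, -g names the genome FASTA.
    echo "gffread -w $ref -g $genome $gtf"
    # Assumption: -ref is the RNA-Bloom option for reference transcripts.
    echo "java -jar RNA-Bloom.jar -left sample1_1.fq.gz -right sample1_2.fq.gz -revcomp-right -ref $ref"
}

build_ref_cmds merged.gtf genome.fa stringtie_transcripts.fa
```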
Hi,
If I have many bulk RNA-seq samples (from different tissues or different individuals), what is the best way to assemble these data:
1) merge all FASTQ files by cat (zcat *.R1.fq.gz|gzip -c > merge_1.fq.gz; zcat *.R2.fq.gz|gzip -c > merge_2.fq.gz;) and then use RNA-Bloom to assemble the merged FASTQ files.
2) use RNA-Bloom to assemble each sample separately and merge the assemblies.
Best, Kun