Open mictadlo opened 5 years ago
Hi Michal,
Thanks for your e-mail, and for your interest in the LACHESIS software! The first thing I should mention is that LACHESIS is no longer being actively developed or maintained, as stated on the GitHub front page. I recommend you take a look at the Juicer software from the Aiden lab (https://github.com/theaidenlab), a more recently developed and actively maintained piece of code that serves roughly the same purpose. Also, if you want a research kit that will ensure high-quality Hi-C results, I suggest contacting the folks at Phase Genomics (https://phasegenomics.com/).
As for your concern about 19 chromosomes: As stated in the original paper, LACHESIS can predict roughly, but not precisely, the number of chromosomes in the assembly. Your assembly actually shows a pretty steep drop-off in size after the first 19 scaffolds (#0-#18). This suggests that LACHESIS has correctly picked up on intra-chromosomal signals; even in the absence of external information, you could have estimated roughly 19 chromosomes from the scaffold sizes. I suggest you interpret the 19 largest scaffolds as roughly equivalent to the 19 chromosomes, with some possible noisiness around the merge (cluster #18 in particular is borderline in size). The other, smaller scaffolds are likely true chromosomal sequence that should have been merged into scaffolds #0-#18, but LACHESIS did not see a strong enough signal to make that merge. Note that the combined length of scaffolds #19-#114 is only 57 Mb.
-- Josh
Hi Josh, Thank you for your explanation. By any chance, do you know why none of the contigs have been ordered?
I'm not sure. The clusters are pretty large, so there should be enough signal to order them. Either there is a severe lack of Hi-C link density, or some of your assembly files might have been created incompletely. Try setting OVERWRITE_CLMS = 1.
Hi Josh,
I wish you a Happy New Year. I have now created the BAM files with
bwa mem -5SP [assembly.fasta] [fwd_hic.fastq] [rev_hic.fastq] | samblaster | samtools view -S -h -b -F 2316 > [aligned.bam]
as recommended by Phase Genomics. This reduced the number of clusters from 115 to 20.
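For anyone wondering what the `-F 2316` filter in that command does: 2316 is the sum of four SAM flag bits, and `samtools view -F` drops any alignment that has at least one of them set. Below is a minimal Python sketch that decodes the value. The helper name `decode_flag` is made up for illustration; the bit meanings follow the SAM specification.

```python
# SAM flag bits and their meanings, as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "fails quality checks",
    0x400: "duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(value):
    """Return the names of the SAM flag bits set in `value` (hypothetical helper)."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if value & bit]


print(decode_flag(2316))
# ['read unmapped', 'mate unmapped', 'not primary alignment', 'supplementary alignment']
```

So the filter keeps only primary, non-supplementary alignments where both the read and its mate are mapped.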
Info about input assembly:
DE NOVO ASSEMBLY, with no reference genome (less validation available)
Species: benth
N contigs: 1512 Total length: 2774612304 N50: 4284592
N clusters (derived): 20
N non-singleton clusters: 20
N orderings found: 20
############################
# #
# CLUSTERING METRICS #
# #
############################
Number of contigs in clusters: 1495 (98.88% of all contigs)
Length of contigs in clusters: 2773948172 (99.98% of all sequence length)
+----------+-----------+-------------+
| CLUSTER | NUMBER OF | LENGTH OF |
| NUMBER | CONTIGS | CONTIGS |
+----------+-----------+-------------+
| 0 | 207 | 304822244 |
| 1 | 104 | 251236598 |
| 2 | 103 | 215915806 |
| 3 | 85 | 185990618 |
| 4 | 96 | 185821186 |
| 5 | 137 | 169943199 |
| 6 | 79 | 169694706 |
| 7 | 87 | 160635652 |
| 8 | 80 | 155356232 |
| 9 | 80 | 128553045 |
| 10 | 59 | 121698875 |
| 11 | 53 | 120471892 |
| 12 | 62 | 114055062 |
| 13 | 45 | 105996889 |
| 14 | 53 | 105077856 |
| 15 | 57 | 88736847 |
| 16 | 44 | 76993531 |
| 17 | 30 | 68241346 |
| 18 | 28 | 44391277 |
| 19 | 6 | 315311 |
+----------+-----------+-------------+
| TOTAL | 1495 | 2773948172 |
+----------+-----------+-------------+
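Josh's earlier point about reading the chromosome count off the size drop-off can be checked against the table above. The following is only a rough sketch, not anything LACHESIS does internally: the function name `estimate_chromosome_count` is hypothetical, and "largest ratio between consecutive cluster lengths" is one simple assumed way to define the drop-off.

```python
def estimate_chromosome_count(lengths):
    """Return how many clusters come before the steepest relative size drop.

    Assumes every length is positive. This is a heuristic, not a LACHESIS feature.
    """
    ordered = sorted(lengths, reverse=True)
    if len(ordered) < 2:
        return len(ordered)
    # Ratio of each cluster to the next smaller one; the largest ratio marks the drop-off.
    ratios = [ordered[i] / ordered[i + 1] for i in range(len(ordered) - 1)]
    return ratios.index(max(ratios)) + 1


# Cluster lengths copied from the table above (clusters 0-19).
cluster_lengths = [
    304822244, 251236598, 215915806, 185990618, 185821186,
    169943199, 169694706, 160635652, 155356232, 128553045,
    121698875, 120471892, 114055062, 105996889, 105077856,
    88736847, 76993531, 68241346, 44391277, 315311,
]

print(estimate_chromosome_count(cluster_lengths))  # 19
```

Cluster 19 is roughly 140 times smaller than cluster 18, while no other neighbouring pair differs by more than about 1.5 times, which is consistent with 19 chromosome-scale groups plus one small leftover cluster.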
Unfortunately, they are not ordered and oriented:
Number of contigs in orderings: 0 (0% of all contigs in clusters, 0% of all contigs)
Length of contigs in orderings: 0 (0% of all length in clusters, 0% of all sequence length)
Number of contigs in trunks: 0 (-nan% of contigs in orderings)
Length of contigs in trunks: 0 (-nan% of length in orderings)
Fraction of contigs in orderings with high orientation quality: 0 (-nan%), with length 0 (-nan%)
Fraction of contigs in trunks with high orientation quality: 0 (-nan%), with length 0 (-nan%)
I also tried OVERWRITE_CLMS = 1 without any success. Could this be caused by the files below, which were created outside the out folder?
-rw-r--r-- 1 1032814217 root 24K Jan 2 03:39 QMg_NbQ4P_RN.fasta.counts_GATC.txt
-rw-r--r-- 1 1032814217 root 17K Jan 2 04:13 QMg_NbQ4P_RN.fasta.names
-rw-r--r-- 1 1032814217 root 102 Jan 2 05:46 heatmap.chrom_breaks.txt
-rw-r--r-- 1 1032814217 root 6 Jan 2 05:46 heatmap.txt
Thank you in advance,
Michal
@mictadlo I have the same trouble with ordering. Did you solve it?
Hi,
I had the same issue: no contig ordering at all. I then found that my SAM file was not sorted by read name.
Jorge
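A quick way to test for the problem Jorge describes is to check that all alignments sharing a read name sit next to each other in the SAM file. The Python sketch below does that on plain SAM text. The function name `names_are_grouped` is hypothetical, and the check only confirms grouping, which is a necessary condition for a name-sorted file, not the sort order itself. It keeps every read name in memory, so it is best run on a subset of a large file.

```python
def names_are_grouped(sam_lines):
    """Return True if every read name occupies one contiguous block of lines.

    `sam_lines` is any iterable of SAM text lines, e.g. an open file handle.
    """
    seen = set()
    previous = None
    for line in sam_lines:
        # Skip header lines (they start with '@') and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        name = line.split("\t", 1)[0]  # QNAME is the first tab-separated field
        if name != previous:
            if name in seen:
                return False  # this read name already appeared in an earlier block
            seen.add(name)
            previous = name
    return True


# Tiny illustrative inputs (only the first two SAM fields are shown).
grouped = ["@HD\tVN:1.6", "readA\t65", "readA\t129", "readB\t65", "readB\t129"]
scattered = ["@HD\tVN:1.6", "readA\t65", "readB\t65", "readA\t129", "readB\t129"]

print(names_are_grouped(grouped))    # True
print(names_are_grouped(scattered))  # False
```

If the check fails, re-sorting by name with `samtools sort -n` is the usual remedy, though whether that alone fixes the ordering step in a given run is not something this thread confirms.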
Hi, running LACHESIS in the way shown below did not give the expected number of chromosomes: I got 115 groups.
cat lachesis/REPORT.txt
provided:
How can I get the expected 19 chromosomes?
Thank you in advance,
Michal