Closed by rdacemel 2 years ago
Hi @rdacemel !
I've pushed a commit that hopefully resolves the first bug - please let me know if I missed any other lines that didn't work on your system.
Regarding the "50% - 50%" parse error: how were your individual libraries processed? Typically the pair types give a breakdown closer to 25%-25%-25%-25%, not 50%-50%. The merge step didn't find that expected breakdown, hence the warning.
The missing All by All message can be ignored; I thought we had suppressed that printout. Which juicer_tools.jar version are you using?
Both of those messages are warnings, but the files should be fine.
Thanks so much for catching the bug and the feedback!
Hi! That was fast ;)
Hmm, I processed them with juicer with an early exit, to run mega afterwards. I'm not sure what you mean by breakdown of pairs, but here is an example of an individual stats file in case it is helpful:
Read type: Paired End
Sequenced Read Pairs: 15058574
No chimera found: 402463 (2.67%)
One or both reads unmapped: 402463 (2.67%)
2 alignments: 14245238 (94.60%)
2 alignments (A...B): 13327908 (88.51%)
2 alignments (A1...A2B; A1B2...B1A2): 917330 (6.09%)
3 or more alignments: 410873 (2.73%)
Ligation Motif Present: N/A
Average insert size: 309.70
Total Unique: 14071075 (98.78%, 93.44%)
Total Duplicates: 174163 (1.22%, 1.16%)
Library Complexity Estimate*: 577,819,105
Intra-fragment Reads: N/A
Below MAPQ Threshold: 3,518,125 (23.36% / 25.00%)
Hi-C Contacts: 10,552,950 (70.08% / 75.00%)
3' Bias (Long Range): 50% - 50%
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
L-I-O-R Convergence: 1233
Inter-chromosomal: 3,396,090 (22.55% / 24.14%)
Intra-chromosomal: 7,156,860 (47.53% / 50.86%)
Short Range (<20Kb):
  <500BP: 1,064,865 (7.07% / 7.57%)
  500BP-5kB: 426,444 (2.83% / 3.03%)
  5kB-20kB: 600,444 (3.99% / 4.27%)
Long Range (>20Kb): 5,065,107 (33.64% / 36.00%)
I downloaded the latest jar that I saw available (2.13.07).
Ah, thanks for sharing this! Let me look into the merge script and investigate this further. Did it still manage to build a merged inter.txt/inter_30.txt, or were those missing in the created mega folder?
Also, apologies, but can you resubmit your request to join the Google Group? We've been having trouble filtering real vs fake users; if the reason field is empty, account requests are assumed to be spam.
It was indeed created:
Read type: Paired End
Sequenced Read Pairs: 63338629
No chimera found: 1676126 (2.65%)
One or both reads unmapped: 1676126 (2.65%)
2 alignments: 59935373 (94.63%)
2 alignments (A...B): 56054582 (88.50%)
2 alignments (A1...A2B; A1B2...B1A2): 3880791 (6.13%)
3 or more alignments: 1727130 (2.73%)
Total Unique: 59191165 (93.45% / 98.76%)
Total Duplicates: 744208 (1.17% / 1.24%)
Below MAPQ Threshold: 20158790 (31.83% / 34.06%)
Hi-C Contacts: 39032375 (61.62% / 65.94%)
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
L-I-O-R Convergence: 1520
Inter-chromosomal: 11717002 (18.50% / 19.80%)
Intra-chromosomal: 27315373 (43.13% / 46.15%)
Short Range (<20Kb):
  <500BP: 4141117 (6.54% / 7.00%)
  500BP-5kB: 1624538 (2.56% / 2.74%)
  5kB-20kB: 2286728 (3.61% / 3.86%)
Long Range (>20Kb): 19262990 (30.41% / 32.54%)
I will resubmit yes, I read the note about the bots just after clicking... No worries at all.
Hi there, I am getting the same error using mega:
java.lang.NumberFormatException: For input string: "65% - 35%"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at sfh.merger.StatsMerger.processLine(StatsMerger.java:41)
at sfh.merger.StatsMerger.parse(StatsMerger.java:16)
at sfh.StatsUtils.merge(StatsUtils.java:10)
at sfh.MergeStats.main(MergeStats.java:50)
java.lang.NumberFormatException: For input string: "60% - 40%"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at sfh.merger.StatsMerger.processLine(StatsMerger.java:41)
at sfh.merger.StatsMerger.parse(StatsMerger.java:16)
at sfh.StatsUtils.merge(StatsUtils.java:10)
at sfh.MergeStats.main(MergeStats.java:50)
java.lang.NumberFormatException: For input string: "64% - 36%"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at sfh.merger.StatsMerger.processLine(StatsMerger.java:41)
at sfh.merger.StatsMerger.parse(StatsMerger.java:16)
at sfh.StatsUtils.merge(StatsUtils.java:10)
at sfh.MergeStats.main(MergeStats.java:50)
java.lang.NumberFormatException: For input string: "60% - 40%"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at sfh.merger.StatsMerger.processLine(StatsMerger.java:41)
at sfh.merger.StatsMerger.parse(StatsMerger.java:16)
at sfh.StatsUtils.merge(StatsUtils.java:10)
at sfh.MergeStats.main(MergeStats.java:50)
(-: Finished creating top stats files.
sort: extra operand '/storage/brno2/home/pavlan/Barley_leaf_HiC/Barley_leaf_HiCrep1/aligned/merged1.txt.gz)' not allowed with -c
After that, the process stays in S status and does not seem to progress.
My replicates were processed with an older juicer release, so I had to rename merged_nodups.txt to merged1.txt.
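The `extra operand ... not allowed with -c` message above is GNU sort's complaint that check mode (`-c`) received more than one input. As a minimal sketch, assuming you want to verify the sort order of several uncompressed files by hand, each file can be checked separately. The helper name and demo files are hypothetical, the key options are copied from the mega.sh sort command quoted elsewhere in this thread, and this is not necessarily what mega.sh itself does.

```shell
#!/bin/sh
# Sketch: GNU sort's check mode (-c) takes a single input,
# so several files have to be checked one at a time.
LC_ALL=C
export LC_ALL

# Hypothetical helper: report the first file that is out of order.
check_each_sorted() {
    for f in "$@"; do
        # -c only verifies the order on fields 2 and 6; it writes no sorted output.
        if ! sort -c -k2,2d -k6,6d "$f" 2>/dev/null; then
            echo "not sorted: $f"
            return 1
        fi
    done
    echo "all sorted"
}

# Hypothetical demo data: a.txt is ordered on field 2, b.txt is not.
demo_dir=$(mktemp -d)
printf '0 chr1 100 0 0 chr1 200 1\n0 chr2 50 1 0 chr2 80 2\n' > "$demo_dir/a.txt"
printf '0 chr2 50 1 0 chr2 80 2\n0 chr1 100 0 0 chr1 200 1\n' > "$demo_dir/b.txt"

sorted_result=$(check_each_sorted "$demo_dir/a.txt")
mixed_result=$(check_each_sorted "$demo_dir/a.txt" "$demo_dir/b.txt" || true)
echo "$sorted_result"
echo "$mixed_result"
rm -rf "$demo_dir"
```

Compressed inputs such as merged1.txt.gz would need to be decompressed first; the sketch only covers plain text files.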
Hi,
I am also getting the same message when running mega.sh:
java.lang.NumberFormatException: For input string: "50% - 50%"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.base/java.lang.Long.parseLong(Long.java:692)
at java.base/java.lang.Long.parseLong(Long.java:817)
at sfh.merger.StatsMerger.processLine(StatsMerger.java:41)
at sfh.merger.StatsMerger.parse(StatsMerger.java:16)
at sfh.StatsUtils.merge(StatsUtils.java:10)
at sfh.MergeStats.main(MergeStats.java:50)
(-: Finished creating top stats files.
(-: Finished sorting all files into a single merge.
Will it affect the following steps to sort individual merged1.txt or merged30.txt and to create the combined .hic file?
Many thanks!!
The .hic file should still build. The 3' bias line may be missing in the stats. We will work on a fix for this. But the overall .hic file should work without issue.
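The traces in this thread show `Long.parseLong` being handed values such as "50% - 50%", which are percentages rather than counts. As a rough diagnostic sketch, assuming you want to see which lines of a stats file carry percent-only values, a `grep` pattern like the one below can list them. The helper name and the sample file contents are illustrative, and beyond the 3' bias values shown in the traces it is not confirmed here which lines the merger actually trips on.

```shell
#!/bin/sh
# Hypothetical helper: print stats lines whose value consists only of
# percentages separated by dashes, e.g. "50% - 50%".
percent_only_lines() {
    grep -E ':[[:space:]]*[0-9]+%([[:space:]]*-[[:space:]]*[0-9]+%)+[[:space:]]*$' "$1"
}

# Demo on a small excerpt copied from the stats file posted in this thread.
stats_file=$(mktemp)
cat > "$stats_file" <<'EOF'
Sequenced Read Pairs: 15058574
Hi-C Contacts: 10,552,950 (70.08% / 75.00%)
3' Bias (Long Range): 50% - 50%
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
L-I-O-R Convergence: 1233
EOF

flagged=$(percent_only_lines "$stats_file")
flagged_count=$(printf '%s\n' "$flagged" | grep -c .)
echo "$flagged"
echo "percent-only lines: $flagged_count"
rm -f "$stats_file"
```

On this excerpt the pattern picks out the 3' Bias and Pair Type lines and leaves the count lines alone.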
Are you sure this is an issue? Pretty much sure.
I've been playing around with the new Juicer/mega and I think overall it is a big improvement! I really like how intermediate files are managed now. However, some things required fixing, at least on my system. I had to change
java -Xmx2g -jar "${juiceDir}"/scripts/common/merge-stats.jar "$outputDir"/inter "${inter_names}"
for
java -Xmx2g -jar "${juiceDir}"/scripts/common/merge-stats.jar "$outputDir"/inter ${inter_names}
and
sort --parallel=40 -T "${tmpdir}" -m -k2,2d -k6,6d "${merged_names}" > "${outputDir}"/merged1.txt
for
sort --parallel=40 -T "${tmpdir}" -m -k2,2d -k6,6d ${merged_names} > "${outputDir}"/merged1.txt
Are those two last messages expected, or should I worry? Best, Rafa.
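The quoting change described in this post comes down to shell word splitting: a quoted `"${inter_names}"` reaches the command as one argument, while the unquoted form is split on whitespace into one argument per file. A minimal sketch of the difference, assuming the variable holds a space-separated list of paths (the paths and the `count_args` helper below are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments a command actually receives.
count_args() { echo "$#"; }

# Hypothetical value: three stats files separated by spaces.
inter_names="rep1/inter.txt rep2/inter.txt rep3/inter.txt"

quoted=$(count_args "${inter_names}")    # one argument that contains spaces
unquoted=$(count_args ${inter_names})    # split on whitespace into three arguments

echo "quoted: ${quoted}, unquoted: ${unquoted}"
# prints: quoted: 1, unquoted: 3
```

The unquoted form breaks again if a path itself contains spaces; in bash, an array expanded as `"${names[@]}"` is a common way to pass one argument per file without relying on word splitting.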