Closed bcbanderson closed 2 years ago
You can add a line similar to the existing line for the Arima cocktail, but you would have to add the extra possible junctions yourself. The relevant Arima line is:
Arima) ligation="'(GAATAATC|GAATACTC|GAATAGTC|GAATATTC|GAATGATC|GACTAATC|GACTACTC|GACTAGTC|GACTATTC|GACTGATC|GAGTAATC|GAGTACTC|GAGTAGTC|GAGTATTC|GAGTGATC|GATCAATC|GATCACTC|GATCAGTC|GATCATTC|GATCGATC|GATTAATC|GATTACTC|GATTAGTC|GATTATTC|GATTGATC)'" ;;
That is already 25 possible junctions, coming from ^GATC and G^ANTC (two of the patterns in your cocktail). The 25 junctions arise because those two patterns give 5 cutting sequences (GATC itself, plus the 4 obtained by substituting A, C, G, or T for N in G^ANTC), and all ligations between them are possible: 5x5=25. To these you would have to add the possible junctions of the other cutters, C^TNAG and T^TAA, which give you an additional 5 cutting sequences (4 from C^TNAG and 1 from T^TAA), for a total of 10. This means, if I am correct, that you have 10x10=100 possible junctions. You have some work to do here, but the syntax is the same as in the Arima line: each possible junction is separated by a | symbol.
You can name this 100-sequence ligation pattern "myCocktail" or whatever you want, and then refer to it by that name with the -s switch when you run juicer.sh:
myCocktail) ligation="'(xxxx|xxxx|your 100 ligation junctions...)'" ;;
Note the quoting: a double quote followed by a single quote at the start, and a single quote followed by a double quote at the end.
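Rather than typing the 100 junctions by hand, the 10x10 counting above can be sketched as a small shell loop. This is only an illustration, not part of Juicer: the variable names `left`, `right`, `junctions`, and `count` are made up here, and the half-sites assume that each 5' overhang is filled in before ligation (which is what reproduces the 25 Arima junctions from ^GATC and G^ANTC). Please check the output against your own protocol before using it.

```shell
# Filled-in half-sites for ^GATC, G^ANTC, C^TNAG and T^TAA, with N expanded.
# Assumption: the overhang is filled in before ligation, so the left half of a
# junction is the site minus the bases after the overhang, and the right half
# is the site minus the bases before the cut.
left="GATC GAAT GACT GAGT GATT CTAA CTCA CTGA CTTA TTA"
right="GATC AATC ACTC AGTC ATTC TAAG TCAG TGAG TTAG TAA"

junctions=""
count=0
for l in $left; do
    for r in $right; do
        # Append "left half + right half", separated from earlier entries by |
        junctions="${junctions:+$junctions|}$l$r"
        count=$((count + 1))
    done
done

# Same quoting as the Arima entry: double quote outside, single quote inside.
ligation="'($junctions)'"
echo "$count junctions"
echo "myCocktail) ligation=\"$ligation\" ;;"
```

The last line printed is in the same form as the Arima entry and could be pasted into the case statement. Because T^TAA contributes three-base halves, the junctions range from 6 to 8 bases long.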
I recommend that you read the following thread: https://groups.google.com/g/3d-genomics/c/1kgiGvi7vg8
This is correct; also note that one should appropriately account for the baseline rate, which will be quite high with so many possible sequences.
On Fri, Aug 20, 2021 at 4:42 PM edfajardo @.***> wrote: [quoted copy of the reply above]
-- Neva Cherniavsky Durand, Ph.D. | she, her, hers Assistant Professor | Molecular and Human Genetics Aiden Lab | Baylor College of Medicine www.aidenlab.org
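One way to get a feel for the baseline-rate remark above is a back-of-the-envelope estimate of how often any of the 100 junction sequences would appear in a read purely by chance. The sketch below is an assumption-laden illustration, not something Juicer computes: it assumes uniform, independent bases, a hypothetical read length of 150 (`read_len`), and it sums the per-motif probabilities, which slightly overestimates the per-position rate.

```shell
read_len=150   # hypothetical read length, not taken from the thread

estimate=$(awk -v L="$read_len" 'BEGIN {
    # Each side has 9 four-base half-sites and 1 three-base half-site, so the
    # 100 junctions split into 81 of 8 bases, 18 of 7 bases and 1 of 6 bases.
    p = 81 / 4^8 + 18 / 4^7 + 1 / 4^6   # chance that some junction starts at a given position
    n = L - 7                           # start positions available to an 8-base motif
    # Treat positions as independent (an approximation) to get a per-read chance.
    printf "%.4f %.2f", p, 1 - (1 - p)^n
}')
echo "per-position rate and per-read chance: $estimate"
```

Under these assumptions the per-position rate is roughly 0.26%, and about 31% of random 150-base reads would contain at least one junction sequence. With a single 8-base junction such as GATCGATC, the same calculation gives well under 1%, which is why the baseline matters so much more for a large cocktail.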
Hey, I have Hi-C data for which a cocktail of restriction enzymes was used, cutting at the following recognition sites: ^GATC, G^ANTC, C^TNAG, and T^TAA. How do I code these patterns in the section "Set ligation junction based on restriction enzyme" of the Juicer pipeline? Thank you!
Set ligation junction based on restriction enzyme
case $site in
    HindIII) ligation="AAGCTAGCTT";;
    DpnII) ligation="GATCGATC";;
    MboI) ligation="GATCGATC";;
    NcoI) ligation="CCATGCATGG";;
    none) ligation="XXXX";;
    *)  ligation="XXXX"
        echo "$site not listed as recognized enzyme. Using $site_file as site file"
        echo "Ligation junction is undefined"
        exit 100
esac