Closed muffato closed 3 years ago
Hi Matthieu,
Can you provide the following additional information?
List of files and their size in the “SFs” directory.
Content of the parameter file
Program output log
Best,
Jaebum
On May 17, 2021, at 8:07 PM, Matthieu Muffato @.***> wrote:
Hello,
I want to test DESCHRAMBLER on simulated genomes (made as lists of genes), and I would like to know how I can make it not require the genome alignments (no sequences are simulated).
I have generated a Conserved.Segments file from my simulations. Like in sim.data.new.tar.gz (the simulations in your 2017 PNAS paper), each gene takes 1kb of space in the genome:
0 Gallus_gallus.16:415000-416000 - Monodelphis_domestica.4:5000-6000 - Canis_lupus_familiaris.15:651000-652000 + Homo_sapiens.13:648000-649000 + Mus_musculus.13:206000-207000 +
1 Gallus_gallus.2:869000-870000 + Monodelphis_domestica.8:4310000-4311000 - Canis_lupus_familiaris.28:270000-271000 - Homo_sapiens.23:2223000-2224000 + Mus_musculus.8:833000-834000 +
(...) I hacked Makefile.SFs to use his file instead, and I have changed the RESOLUTION parameter to 1kb, but it still tries to access the alignments to estimate the breakpoint distances, and the resulting reconstructions are 1 APCF per entry in Conserved.Segments - i.e. no reconstruction.
Could you please help ?
Regards, Matthieu
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV6QJQHUZUVMHX7LYXLTOD2HXANCNFSM45AHORSQ.
Sure, thank you for the quick reply !
1.
704 APCFs.1K/SFs
176953 APCFs.1K/SFs/Mus_musculus.joins
49 APCFs.1K/SFs/ingroup.txt
2821 APCFs.1K/SFs/Makefile
93 APCFs.1K/SFs/bpdist.txt
160 APCFs.1K/SFs/SFs_Gallus_gallus
2821 APCFs.1K/SFs/SFs_Gallus_gallus/Makefile
0 APCFs.1K/SFs/SFs_Gallus_gallus/Building.Blocks
590 APCFs.1K/SFs/SFs_Gallus_gallus/config.file
160 APCFs.1K/SFs/SFs_Canis_lupus_familiaris
2821 APCFs.1K/SFs/SFs_Canis_lupus_familiaris/Makefile
0 APCFs.1K/SFs/SFs_Canis_lupus_familiaris/Building.Blocks
590 APCFs.1K/SFs/SFs_Canis_lupus_familiaris/config.file
249059 APCFs.1K/SFs/Genomes.Order.new
0 APCFs.1K/SFs/block_consscores.txt
759908 APCFs.1K/SFs/adjacencies.prob
248981 APCFs.1K/SFs/Genomes.Order
2518304 APCFs.1K/SFs/Conserved.Segments
171277 APCFs.1K/SFs/Canis_lupus_familiaris.joins
141477 APCFs.1K/SFs/Gallus_gallus.joins
158792 APCFs.1K/SFs/Monodelphis_domestica.joins
362332 APCFs.1K/SFs/block_list.txt
184989 APCFs.1K/SFs/Homo_sapiens.joins
128 APCFs.1K/SFs/SFs_Monodelphis_domestica
2821 APCFs.1K/SFs/SFs_Monodelphis_domestica/Makefile
590 APCFs.1K/SFs/SFs_Monodelphis_domestica/config.file
36 APCFs.1K/SFs/outgroup.txt
587 APCFs.1K/SFs/config.file
128 APCFs.1K/SFs/SFs_Mus_musculus
2821 APCFs.1K/SFs/SFs_Mus_musculus/Makefile
590 APCFs.1K/SFs/SFs_Mus_musculus/config.file
2.
# Reference species
REFSPC=Homo_sapiens
# Output directory
OUTPUTDIR=APCFs.1K
#OUTPUTDIR=Out_RACA.anctest
# Block resolution (bp)
RESOLUTION=1000
#RESOLUTION=300000
# Newick tree file
# Refer to the sample file 'tree.txt'.
TREEFILE=tree.txt
# Minimum adjacency scores
MINADJSCR=0.0001
# Config and make files for syntenic fragment construction
# Refer to the sample files 'config.SFs' and 'Makefile.SFs'.
# You need to change settings in the sample configuration file (config.SFs) according to your data
# You don't need to change anything in the Makefile.SFs.
CONFIGSFSFILE=config.SFs
MAKESFSFILE=Makefile.SFs
The tree tree.txt
has no branch lengths, but I can provide some if that's required:
((((Homo_sapiens,Mus_musculus),Canis_lupus_familiaris)@,Monodelphis_domestica),Gallus_gallus);
3.
## Constructing syntenic fragments ##
cp -p /Users/mm49/workspace/simu2/Conserved.Segments Conserved.Segments
======== creating input files for inferring CARs ========
/Users/mm49/workspace/DESCHRAMBLER/code/makeBlocks/createGenomeFile config.file Conserved.Segments > Genomes.Order
/Users/mm49/workspace/DESCHRAMBLER/script/create_blocklist.pl Homo_sapiens APCFs.1K/SFs
/Users/mm49/workspace/DESCHRAMBLER/script/wrap_recon_apcf.pl /Users/mm49/workspace/DESCHRAMBLER/matt/tree.txt 1000 Homo_sapiens 0.0001 APCFs.1K/SFs APCFs.1K
Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Canis_lupus_familiaris/Homo_sapiens/net
Cannot open Canis_lupus_familiaris.processed.segs.
make: *** [Building.Blocks] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Mus_musculus/net
- processing Mus_musculus.raw.segs
Cannot open Mus_musculus.raw.segs.
make: *** [Grab.Data] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Gallus_gallus/Homo_sapiens/net
Cannot open Gallus_gallus.processed.segs.
make: *** [Building.Blocks] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Monodelphis_domestica/net
- processing Monodelphis_domestica.raw.segs
Cannot open Monodelphis_domestica.raw.segs.
make: *** [Grab.Data] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
TREE ((((Homo_sapiens,Mus_musculus),Canis_lupus_familiaris)@,Monodelphis_domestica),Gallus_gallus);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
14984 Homo_sapiens /Users/mm49/workspace/DESCHRAMBLER/matt/tree.txt APCFs.1K/SFs/bpdist.txt
Estimate JC parameter: 0.0001
readLeafGenomes: Homo_sapiens
readLeafGenomes: Mus_musculus
readLeafGenomes: Canis_lupus_familiaris
T=14984
Initializing Homo_sapiens (ingroup)
Initializing Mus_musculus (ingroup)
Initializing Canis_lupus_familiaris (ingroup)
Initializing Gallus_gallus (outgroup)
Initializing Monodelphis_domestica (outgroup)
Computing posterior probabilities ...
Died at /Users/mm49/workspace/DESCHRAMBLER/script/refine_adjprob.pl line 26, <F> line 207.
Minimum weight = 0.0001
Conservation score file = APCFs.1K/SFs/block_consscores.txt
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
- Totally 14983 APCFs
Some comments:
cp
is how I hacked my own Conserved.Segments
into the workflow`-enclosed commands to
print("..."); system("...")` so that I could see which command was run and their own outputThe first problem is the absence of branch lengths. It is required information. Please try again after adding the branch lengths.
On May 18, 2021, at 12:17 AM, Matthieu Muffato @.***> wrote:
Sure, thank you for the quick reply !
704 APCFs.1K/SFs 176953 APCFs.1K/SFs/Mus_musculus.joins 49 APCFs.1K/SFs/ingroup.txt 2821 APCFs.1K/SFs/Makefile 93 APCFs.1K/SFs/bpdist.txt 160 APCFs.1K/SFs/SFs_Gallus_gallus 2821 APCFs.1K/SFs/SFs_Gallus_gallus/Makefile 0 APCFs.1K/SFs/SFs_Gallus_gallus/Building.Blocks 590 APCFs.1K/SFs/SFs_Gallus_gallus/config.file 160 APCFs.1K/SFs/SFs_Canis_lupus_familiaris 2821 APCFs.1K/SFs/SFs_Canis_lupus_familiaris/Makefile 0 APCFs.1K/SFs/SFs_Canis_lupus_familiaris/Building.Blocks 590 APCFs.1K/SFs/SFs_Canis_lupus_familiaris/config.file 249059 APCFs.1K/SFs/Genomes.Order.new 0 APCFs.1K/SFs/block_consscores.txt 759908 APCFs.1K/SFs/adjacencies.prob 248981 APCFs.1K/SFs/Genomes.Order 2518304 APCFs.1K/SFs/Conserved.Segments 171277 APCFs.1K/SFs/Canis_lupus_familiaris.joins 141477 APCFs.1K/SFs/Gallus_gallus.joins 158792 APCFs.1K/SFs/Monodelphis_domestica.joins 362332 APCFs.1K/SFs/block_list.txt 184989 APCFs.1K/SFs/Homo_sapiens.joins 128 APCFs.1K/SFs/SFs_Monodelphis_domestica 2821 APCFs.1K/SFs/SFs_Monodelphis_domestica/Makefile 590 APCFs.1K/SFs/SFs_Monodelphis_domestica/config.file 36 APCFs.1K/SFs/outgroup.txt 587 APCFs.1K/SFs/config.file 128 APCFs.1K/SFs/SFs_Mus_musculus 2821 APCFs.1K/SFs/SFs_Mus_musculus/Makefile 590 APCFs.1K/SFs/SFs_Mus_musculus/config.file
Reference species
REFSPC=Homo_sapiens
Output directory
OUTPUTDIR=APCFs.1K
OUTPUTDIR=Out_RACA.anctest
Block resolution (bp)
RESOLUTION=1000
RESOLUTION=300000
Newick tree file
Refer to the sample file 'tree.txt'.
TREEFILE=tree.txt
Minimum adjacency scores
MINADJSCR=0.0001
Config and make files for syntenic fragment construction
Refer to the sample files 'config.SFs' and 'Makefile.SFs'.
You need to change settings in the sample configuration file (config.SFs) according to your data
You don't need to change anything in the Makefile.SFs.
CONFIGSFSFILE=config.SFs MAKESFSFILE=Makefile.SFs The tree tree.txt has no branch lengths, but I can provide some if that's required:
((((Homo_sapiens,Mus_musculus),Canis_lupus_familiaris)@,Monodelphis_domestica),Gallus_gallus);
Constructing syntenic fragments
cp -p /Users/mm49/workspace/simu2/Conserved.Segments Conserved.Segments ======== creating input files for inferring CARs ======== /Users/mm49/workspace/DESCHRAMBLER/code/makeBlocks/createGenomeFile config.file Conserved.Segments > Genomes.Order /Users/mm49/workspace/DESCHRAMBLER/script/create_blocklist.pl Homo_sapiens APCFs.1K/SFs /Users/mm49/workspace/DESCHRAMBLER/script/wrap_recon_apcf.pl /Users/mm49/workspace/DESCHRAMBLER/matt/tree.txt 1000 Homo_sapiens 0.0001 APCFs.1K/SFs APCFs.1K Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Canis_lupus_familiaris/Homo_sapiens/net Cannot open Canis_lupus_familiaris.processed.segs. make: *** [Building.Blocks] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Mus_musculus/net
- processing Mus_musculus.raw.segs Cannot open Mus_musculus.raw.segs. make: [Grab.Data] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Gallus_gallus/Homo_sapiens/net Cannot open Gallus_gallus.processed.segs. make: [Building.Blocks] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. Error - Could not open net dir /Users/mm49/workspace/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Monodelphis_domestica/net
- processing Monodelphis_domestica.raw.segs Cannot open Monodelphis_domestica.raw.segs. make: *** [Grab.Data] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. TREE ((((Homo_sapiens,Mus_musculus),Canis_lupus_familiaris)@,Monodelphis_domestica),Gallus_gallus);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. 14984 Homo_sapiens /Users/mm49/workspace/DESCHRAMBLER/matt/tree.txt APCFs.1K/SFs/bpdist.txt Estimate JC parameter: 0.0001 readLeafGenomes: Homo_sapiens readLeafGenomes: Mus_musculus readLeafGenomes: Canis_lupus_familiaris T=14984 Initializing Homo_sapiens (ingroup) Initializing Mus_musculus (ingroup) Initializing Canis_lupus_familiaris (ingroup) Initializing Gallus_gallus (outgroup) Initializing Monodelphis_domestica (outgroup) Computing posterior probabilities ... Died at /Users/mm49/workspace/DESCHRAMBLER/script/refine_adjprob.pl line 26,
line 207. Minimum weight = 0.0001 Conservation score file = APCFs.1K/SFs/block_consscores.txt UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
- Totally 14983 APCFs Some comments:
The first cp is how I hacked my own Conserved.Segments into the workflow I have transformed some of the ``-enclosed commands to print("..."); system("...") so that I could see which command was run and their own output — You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-842409862, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV4GU3CAJSEDKPMXI2DTOEXSHANCNFSM45AHORSQ.
Thanks. Will do that. What is the unit of the lengths ? Does it matter if I say multiply all branches by 10 ?
You can use either wall clock time or expected number of substitutions per site. Using 10 for all branches? If you don’t know branch lengths, you can try it.
- 오전 5:52, Matthieu Muffato @.***> 작성:
Thanks. Will do that. What is the unit of the lengths ? Does it matter if I say multiply all branches by 10 ?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
Hi @jkimlab . It worked, thanks. But there is something strange in Ancestor.APCF
: some APCFs have two lines, cf APCF 49 below.
# APCF 48
-5946 5808 -7446 -5360 9524 2263 5072 11131 -6579 -2868 4454 8995 4311 9645 -2002 -13821 -11864 11373 7904 1085 8645 2255 -14836 4026 $
# APCF 49
13429 2530 4130 -3217 -4640 14546 12335 11054 -11708 12400 5330 -2984 -10633 3828 -12585 11574 988 -1885 $
6892 7721 -11547 -1546 -8358 9097 -13496 9663 12152 -12899 9780 -5633 -2899 12566 11257 11761 7787 9701 $
# APCF 50
4695 -14550 -11416 4817 10295 -796 -11621 -6104 -2225 -12957 12116 -152 -2164 5804 -3939 -8186 14545 $
# APCF 51
The documentation says:
Lines starting with "#": # APCF
Other lines: the order and orientation of SFs belonging to the APCF shown immediately above line. SFs are shown by using their identifiers (refer to the block_list.txt file).
Which I interpreted as "Here is the content of the 48th APCF, the 49th APCF, etc."
The program says there are 117 APCFs
$ ../DESCHRAMBLER.pl $PWD/params.txt
(...)
- Totally 117 APCFs
but when I check the file, I have this:
$ grep -c '^>' Ancestor.APCF
1
$ grep -c '^#' Ancestor.APCF
63
$ grep -c '\$$' Ancestor.APCF
117
So it looks like the indexing # APCF 49
is wrong and I should just ignore it ?
(apart from that, the results look fine: the ancestral genome found in SFs/block_list.txt
matches the expected, simulated, one very well)
The output looks strange.
Did the program run without any error message? Can you share again the output log and how you set the tree?
I think the program may not be completely modified for your case.
On May 23, 2021, at 12:16 AM, Matthieu Muffato @.***> wrote:
Hi @jkimlab https://github.com/jkimlab . It worked, thanks. But there is something strange in Ancestor.APCF: some APCFs have two lines, cf APCF 49 below.
APCF 48
-5946 5808 -7446 -5360 9524 2263 5072 11131 -6579 -2868 4454 8995 4311 9645 -2002 -13821 -11864 11373 7904 1085 8645 2255 -14836 4026 $
APCF 49
13429 2530 4130 -3217 -4640 14546 12335 11054 -11708 12400 5330 -2984 -10633 3828 -12585 11574 988 -1885 $ 6892 7721 -11547 -1546 -8358 9097 -13496 9663 12152 -12899 9780 -5633 -2899 12566 11257 11761 7787 9701 $
APCF 50
4695 -14550 -11416 4817 10295 -796 -11621 -6104 -2225 -12957 12116 -152 -2164 5804 -3939 -8186 14545 $
APCF 51
The documentation says:
Lines starting with "#": # APCF Other lines: the order and orientation of SFs belonging to the APCF shown immediately above line. SFs are shown by using their identifiers (refer to the block_list.txt file).
Which I interpreted as "Here is the content of the 48th APCF, the 49th APCF, etc."
The program says there are 117 APCFs
$ ../DESCHRAMBLER.pl $PWD/params.txt (...)
- Totally 117 APCFs but when I check the file, I have this:
$ grep -c '^>' Ancestor.APCF 1 $ grep -c '^#' Ancestor.APCF 63 $ grep -c '\$$' Ancestor.APCF 117 So it looks like the indexing # APCF 49 is wrong and I should just ignore it ?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846422392, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PVYXTJTLL7NQ52BPS63TO7DFNANCNFSM45AHORSQ.
## Constructing syntenic fragments ##
cp -p /Users/mm49/workspace/agora/simulations/simu2/Conserved.Segments Conserved.Segments
======== creating input files for inferring CARs ========
/Users/mm49/workspace/agora/simulations/DESCHRAMBLER/code/makeBlocks/createGenomeFile config.file Conserved.Segments > Genomes.Order
/Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/create_blocklist.pl Homo_sapiens APCFs.1K/SFs
/Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/wrap_recon_apcf.pl /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt 1000 Homo_sapiens 0.0001 APCFs.1K/SFs APCFs.1K
Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Canis_lupus_familiaris/Homo_sapiens/net
Cannot open Canis_lupus_familiaris.processed.segs.
make: *** [Building.Blocks] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Mus_musculus/net
- processing Mus_musculus.raw.segs
Cannot open Mus_musculus.raw.segs.
make: *** [Grab.Data] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Gallus_gallus/Homo_sapiens/net
Cannot open Gallus_gallus.processed.segs.
make: *** [Building.Blocks] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Monodelphis_domestica/net
- processing Monodelphis_domestica.raw.segs
Cannot open Monodelphis_domestica.raw.segs.
make: *** [Grab.Data] Error 1
readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61.
TREE ((((Homo_sapiens:1.5611,Mus_musculus:2.1014):1.5229,Canis_lupus_familiaris:1.1084)@:1.6692,Monodelphis_domestica:1.6262):1.1361,Gallus_gallus:1.1042);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
14984 Homo_sapiens /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt APCFs.1K/SFs/bpdist.txt
Estimate JC parameter: 0.0001
readLeafGenomes: Homo_sapiens
readLeafGenomes: Mus_musculus
readLeafGenomes: Canis_lupus_familiaris
T=14984
Initializing Homo_sapiens (ingroup)
Initializing Mus_musculus (ingroup)
Initializing Canis_lupus_familiaris (ingroup)
Initializing Gallus_gallus (outgroup)
Initializing Monodelphis_domestica (outgroup)
Computing posterior probabilities ...
Minimum weight = 0.0001
Conservation score file = APCFs.1K/SFs/block_consscores.txt
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
Use of uninitialized value $len in addition (+) at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/sort_apcfs.pl line 43, <F> line 23.
- Totally 117 APCFs
The program didn’t run correctly. The first error that I found is the following. readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. The script tried to open the “Genomes.Order” file, but it failed. Please check if the file is in the right place.
On May 23, 2021, at 12:04 PM, Matthieu Muffato @.***> wrote:
Constructing syntenic fragments
cp -p /Users/mm49/workspace/agora/simulations/simu2/Conserved.Segments Conserved.Segments ======== creating input files for inferring CARs ======== /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/code/makeBlocks/createGenomeFile config.file Conserved.Segments > Genomes.Order /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/create_blocklist.pl Homo_sapiens APCFs.1K/SFs /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/wrap_recon_apcf.pl /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt 1000 Homo_sapiens 0.0001 APCFs.1K/SFs APCFs.1K Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Canis_lupus_familiaris/Homo_sapiens/net Cannot open Canis_lupus_familiaris.processed.segs. make: *** [Building.Blocks] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Mus_musculus/net
- processing Mus_musculus.raw.segs Cannot open Mus_musculus.raw.segs. make: [Grab.Data] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Gallus_gallus/Homo_sapiens/net Cannot open Gallus_gallus.processed.segs. make: [Building.Blocks] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. Error - Could not open net dir /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/examples/chainNet/Homo_sapiens/Monodelphis_domestica/net
- processing Monodelphis_domestica.raw.segs Cannot open Monodelphis_domestica.raw.segs. make: *** [Grab.Data] Error 1 readline() on closed filehandle F at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/estimate_bpdist4.pl line 61. TREE ((((Homo_sapiens:1.5611,Mus_musculus:2.1014):1.5229,Canis_lupus_familiaris:1.1084)@:1.6692,Monodelphis_domestica:1.6262):1.1361,Gallus_gallus:1.1042);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. 14984 Homo_sapiens /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt APCFs.1K/SFs/bpdist.txt Estimate JC parameter: 0.0001 readLeafGenomes: Homo_sapiens readLeafGenomes: Mus_musculus readLeafGenomes: Canis_lupus_familiaris T=14984 Initializing Homo_sapiens (ingroup) Initializing Mus_musculus (ingroup) Initializing Canis_lupus_familiaris (ingroup) Initializing Gallus_gallus (outgroup) Initializing Monodelphis_domestica (outgroup) Computing posterior probabilities ... Minimum weight = 0.0001 Conservation score file = APCFs.1K/SFs/block_consscores.txt UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. Use of uninitialized value $len in addition (+) at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/sort_apcfs.pl line 43,
line 23.
- Totally 117 APCFs — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846494019, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV7GOF6HZXJCQBL7FZDTPBWFDANCNFSM45AHORSQ.
Yes, because I don't have actual sequences. I'm trying to run it on simulated data like in your paper. All I have is a set of orthologous regions between all my genomes.
Yes, I know. But the “Genomes.Order” file can be generated from the “Conserved.Segments” file that you created manually. Check step 5 in the following file.
https://github.com/jkimlab/DESCHRAMBLER/blob/master/examples/Makefile.SFs https://github.com/jkimlab/DESCHRAMBLER/blob/master/examples/Makefile.SFs
Without the “Genomes.Order” file, the program can’t make correct output.
On May 23, 2021, at 9:26 PM, Matthieu Muffato @.***> wrote:
Yes, because I don't have actual sequences. I'm trying to run it on simulated data like in your paper. All I have is a set of orthologous regions between all my genomes.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846555135, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV4TKCZ7Z7XJTPGEYYDTPDX7FANCNFSM45AHORSQ.
Genomes.Order
is there and has this sort of data:
>Canis_lupus_familiaris 39
# 15
-6087 9829 -9582 -3175 -13854 3312 6454
All ingroups species are in there, and they seem complete
Then, I don’t know the reason of the error by just checking the information you provided. If you can send the compressed tar ball of your working directory (including the parameter file) to me, I can check.
On May 23, 2021, at 9:51 PM, Matthieu Muffato @.***> wrote:
Genomes.Order is there and has this sort of data:
Canis_lupus_familiaris 39
15
-6087 9829 -9582 -3175 -13854 3312 6454 All ingroups species are in there, and they seem complete
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846558399, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PVYMDUCBUY6MBPSPDILTPD23TANCNFSM45AHORSQ.
OK. I found the reason.
In the “SFs” directory, you can see subdirectories with a name “SFs_
So, to run the program correctly, you need to additionally create the above two files in each of the subdirectories. The error occurred because the “Genomes.Order” files do not exist in those subdirectories.
Please try and let me know how it goes.
On May 23, 2021, at 10:33 PM, Matthieu Muffato @.***> wrote:
Sure. Here it is: simu2.zip https://github.com/jkimlab/DESCHRAMBLER/files/6528148/simu2.zip Thank you !
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846564090, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV4IGV5V5KO5Z4YFYL3TPD7ZJANCNFSM45AHORSQ.
OK, I'll give it a try. Thanks !
Why did I get something that looks like the reconstructed genome in SFs/block_list.txt
? It's very very close to the expectation
The block_list.txt file just contains reference-based coordinates of synteny blocks you defined in the Conserved.Segments file. So they are blocks in human genome, not in ancestor, and the program predicts the most probable order and orientation of them in the target ancestor.
On May 24, 2021, at 1:51 AM, Matthieu Muffato @.***> wrote:
OK, I'll give it a try. Thanks ! Why did I get something that looks like the reconstructed genome in SFs/block_list.txt ? It's very very close to the expectation
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846592598, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PVYY3WPHOADBK5KVRLDTPEW7PANCNFSM45AHORSQ.
Still the same discrepancy :(
$ ../DESCHRAMBLER.pl $PWD/params.txt
## Constructing syntenic fragments ##
/Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/create_blocklist.pl Homo_sapiens /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs
/Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/wrap_recon_apcf.pl /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt 1000 Homo_sapiens 0.0001 /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K
TREE ((((Homo_sapiens:1.5611,Mus_musculus:2.1014):1.5229,Canis_lupus_familiaris:1.1084)@:1.6692,Monodelphis_domestica:1.6262):1.1361,Gallus_gallus:1.1042);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
14984 Homo_sapiens /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs/bpdist.txt
Estimate JC parameter: 9.65081390939419e-06
readLeafGenomes: Homo_sapiens
readLeafGenomes: Mus_musculus
readLeafGenomes: Canis_lupus_familiaris
T=14984
Initializing Homo_sapiens (ingroup)
Initializing Mus_musculus (ingroup)
Initializing Canis_lupus_familiaris (ingroup)
Initializing Gallus_gallus (outgroup)
Initializing Monodelphis_domestica (outgroup)
Computing posterior probabilities ...
Minimum weight = 0.0001
Conservation score file = /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs/block_consscores.txt
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
Use of uninitialized value $len in addition (+) at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/sort_apcfs.pl line 43, <F> line 137.
- Totally 115 APCFs
$ grep -c '^>' Ancestor.APCF
1
$ grep -c '^#' Ancestor.APCF
63
$ grep -c '\$$' Ancestor.APCF
115
and the IDs in Ancestor.APCF
seem to differ from the other files, e.g. block_list.txt
because none of the adjacency can be found in the true ancestral genome.
OK. Can you send me your files again?
- 오전 6:53, Matthieu Muffato @.***> 작성:
Still the same discrepancy :(
$ ../DESCHRAMBLER.pl $PWD/params.txt
Constructing syntenic fragments
/Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/create_blocklist.pl Homo_sapiens /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/wrap_recon_apcf.pl /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt 1000 Homo_sapiens 0.0001 /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K TREE ((((Homo_sapiens:1.5611,Mus_musculus:2.1014):1.5229,Canis_lupus_familiaris:1.1084)@:1.6692,Monodelphis_domestica:1.6262):1.1361,Gallus_gallus:1.1042);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. 14984 Homo_sapiens /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/tree.txt /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs/bpdist.txt Estimate JC parameter: 9.65081390939419e-06 readLeafGenomes: Homo_sapiens readLeafGenomes: Mus_musculus readLeafGenomes: Canis_lupus_familiaris T=14984 Initializing Homo_sapiens (ingroup) Initializing Mus_musculus (ingroup) Initializing Canis_lupus_familiaris (ingroup) Initializing Gallus_gallus (outgroup) Initializing Monodelphis_domestica (outgroup) Computing posterior probabilities ... Minimum weight = 0.0001 Conservation score file = /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/matt/APCFs.1K/SFs/block_consscores.txt UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. Use of uninitialized value $len in addition (+) at /Users/mm49/workspace/agora/simulations/DESCHRAMBLER/script/sort_apcfs.pl line 43,
line 137.
- Totally 115 APCFs $ grep -c '^>' Ancestor.APCF 1 $ grep -c '^#' Ancestor.APCF 63 $ grep -c '\$$' Ancestor.APCF 115 and the IDs in Ancestor.APCF seem to differ from the other files, e.g. block_list.txt because none of the adjacency can be found in the true ancestral genome.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.
Sure :) simu2b.zip
I found some problems in your Conserved.Segments file.
You need to use positive numbers, not 0, for all blocks, the line for a reference species needs to be appear first in a block definition, and the orientation of the reference block should be always +.
For example, I found the following block definition.
0 Gallus_gallus.16:415000-416000 - Monodelphis_domestica.4:5000-6000 - Canis_lupus_familiaris.15:651000-652000 + Homo_sapiens.13:648000-649000 + Mus_musculus.13:206000-207000 +
It needs to be changed as follows.
1 Homo_sapiens.13:648000-649000 + Gallus_gallus.16:415000-416000 - Monodelphis_domestica.4:5000-6000 - Canis_lupus_familiaris.15:651000-652000 + Mus_musculus.13:206000-207000 +
And in the case of the following block,
2 Gallus_gallus.9:40000-41000 - Monodelphis_domestica.2:641000-642000 - Canis_lupus_familiaris.22:499000-500000 - Homo_sapiens.20:286000-287000 -
It needs to be changed as follows.
2 Homo_sapiens.20:286000-287000 + Gallus_gallus.9:40000-41000 + Monodelphis_domestica.2:641000-642000 + Canis_lupus_familiaris.22:499000-500000 +
I not sure but I think the block order of other species does not matter.
And if there is not any aligned blocks from other species to the reference, it should be removed in the Conserved.Segments file. The example is below.
14983 Homo_sapiens.12:167000-168000 +
Please fix the above problems, and try again.
On May 24, 2021, at 9:49 AM, Matthieu Muffato @.***> wrote:
Sure :) simu2b.zip https://github.com/jkimlab/DESCHRAMBLER/files/6529209/simu2b.zip — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-846660922, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV6XAWMNG2QWLRFKTJDTPGPAZANCNFSM45AHORSQ.
Ah, I see. Yes, I can definitely change that 👍 . Thank you !
Hi,
I applied those fixes and got a clean run, but there is still the same weirdness in Ancestor.APCF
. The only changes I have are:
$ git diff
diff --git a/DESCHRAMBLER.pl b/DESCHRAMBLER.pl
index 134e6f5..7b66d1c 100755
--- a/DESCHRAMBLER.pl
+++ b/DESCHRAMBLER.pl
@@ -2,10 +2,11 @@
use strict;
use warnings;
-use FindBin qw($Bin);
use Cwd;
use Cwd 'abs_path';
+my $Bin = '/Users/mm49/simulations/DESCHRAMBLER';
+
# check the number of argument
if ($#ARGV+1 != 1) {
print STDERR "Usage: ./DESCHRAMBLER.pl <parameter file>\n";
@@ -45,7 +46,7 @@ my $cwd = getcwd();
`sed -e 's:<willbechanged>:$Bin/code/makeBlocks:;s:<treewillbechanged>:$params{"TREEFILE"}:' $params{"MAKESFSFILE"} > $sf_dir/Makefile`;
#}
chdir($sf_dir);
-`make all`;
+`make Genomes.Order`;
chdir($cwd);
`$Bin/script/create_blocklist.pl $params{"REFSPC"} $sf_dir`;
diff --git a/script/estimate_bpdist4.pl b/script/estimate_bpdist4.pl
index e440285..5af8cfe 100755
--- a/script/estimate_bpdist4.pl
+++ b/script/estimate_bpdist4.pl
@@ -52,7 +52,7 @@ foreach my $tar_spc (@inspcs) {
`cp $data_dir/Makefile $out_dir/`;
chdir($out_dir);
- `make pair`;
+ `make Genomes.Order`;
open(F,"Genomes.Order");
my $num_tarscf = 0;
$ diff examples/Makefile.SFs matt/Makefile.SFs
37c37
< Conserved.Segments: Orthology.Blocks
---
> XConserved.Segments: Orthology.Blocks
45a46,48
> Conserved.Segments:
> /Users/mm49/simulations/simu2/filter_CS.py $F /Users/mm49/simulations/simu2/Conserved.Segments > $@
>
$ diff examples/params.txt matt/params.txt
2c2
< REFSPC=spc1
---
> REFSPC=Homo_sapiens
5c5
< OUTPUTDIR=APCFs.300K
---
> OUTPUTDIR=APCFs.1K
9c9
< RESOLUTION=300000
---
> RESOLUTION=1000
$ diff examples/config.SFs.tmp matt/config.SFs
14,16c14,18
< spc1 0 1
< spc2 1 1
< spc3 2 1
---
> Canis_lupus_familiaris 1 1
> Gallus_gallus 2 1
> Homo_sapiens 0 1
> Monodelphis_domestica 2 1
> Mus_musculus 1 1
Some comments:
make all
and make pairwith
make Genomes.Orderand changed the rule
Conserved.Segments` so that it doesn't try to access the alignment nets/Users/mm49/simulations/simu2/filter_CS.py
is my script to filter the master Conserved.Segments
file down to the species listed in config.file
. It removes blocks when the reference species is the only one left, cf #14983 in your previous reply. By the way, when I remove a block, do I need to decrement the index of the following blocks ?chr
(I noticed this was a requirement of using 1 as the second tagThe output is
$ ../DESCHRAMBLER.pl $PWD/params.txt
## Constructing syntenic fragments ##
TREE ((((Homo_sapiens:1.5611,Mus_musculus:2.1014):1.5229,Canis_lupus_familiaris:1.1084)@:1.6692,Monodelphis_domestica:1.6262):1.1361,Gallus_gallus:1.1042);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49//simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
14984 Homo_sapiens /Users/mm49/simulations/DESCHRAMBLER/matt/tree.txt APCFs.1K/SFs/bpdist.txt
Estimate JC parameter: 1.31554210696016e-05
readLeafGenomes: Homo_sapiens
readLeafGenomes: Mus_musculus
readLeafGenomes: Canis_lupus_familiaris
T=14984
Initializing Homo_sapiens (ingroup)
Initializing Mus_musculus (ingroup)
Initializing Canis_lupus_familiaris (ingroup)
Initializing Gallus_gallus (outgroup)
Initializing Monodelphis_domestica (outgroup)
Computing posterior probabilities ...
Minimum weight = 0.0001
Conservation score file = APCFs.1K/SFs/block_consscores.txt
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
- Totally 115 APCFs
and the consistency tests:
$ grep -c '^#' APCFs.1K/Ancestor.APCF
63
9$ grep -c '\$$' APCFs.1K/Ancestor.APCF
115
Can you share again the working directory containing all fixed data files in all subdirectories?
- 오전 8:10, Matthieu Muffato @.***> 작성:
Hi, I applied those fixes and got a clean run, but there is still the same weirdness in Ancestor.APCF. The only changes I have are:
$ git diff diff --git a/DESCHRAMBLER.pl b/DESCHRAMBLER.pl index 134e6f5..7b66d1c 100755 --- a/DESCHRAMBLER.pl +++ b/DESCHRAMBLER.pl @@ -2,10 +2,11 @@
use strict; use warnings; -use FindBin qw($Bin); use Cwd; use Cwd 'abs_path';
+my $Bin = '/Users/mm49/simulations/DESCHRAMBLER'; +
check the number of argument
if ($#ARGV+1 != 1) { print STDERR "Usage: ./DESCHRAMBLER.pl
\n"; @@ -45,7 +46,7 @@ my $cwd = getcwd(); sed -e 's:<willbechanged>:$Bin/code/makeBlocks:;s:<treewillbechanged>:$params{"TREEFILE"}:' $params{"MAKESFSFILE"} > $sf_dir/Makefile
;}
chdir($sf_dir); -
make all
; +make Genomes.Order
;chdir($cwd);
$Bin/script/create_blocklist.pl $params{"REFSPC"} $sf_dir
; diff --git a/script/estimate_bpdist4.pl b/script/estimate_bpdist4.pl index e440285..5af8cfe 100755 --- a/script/estimate_bpdist4.pl +++ b/script/estimate_bpdist4.pl @@ -52,7 +52,7 @@ foreach my $tar_spc @.***) {cp $data_dir/Makefile $out_dir/
;chdir($out_dir);
make pair
;
make Genomes.Order
;open(F,"Genomes.Order"); my $num_tarscf = 0;
$ diff examples/Makefile.SFs matt/Makefile.SFs 37c37 < Conserved.Segments: Orthology.Blocks
XConserved.Segments: Orthology.Blocks 45a46,48 Conserved.Segments: /Users/mm49/simulations/simu2/filter_CS.py $F /Users/mm49/simulations/simu2/Conserved.Segments > $@
$ diff examples/params.txt matt/params.txt 2c2 < REFSPC=spc1
REFSPC=Homo_sapiens 5c5 < OUTPUTDIR=APCFs.300K
OUTPUTDIR=APCFs.1K 9c9 < RESOLUTION=300000
RESOLUTION=1000
$ diff examples/config.SFs.tmp matt/config.SFs 14,16c14,18 < spc1 0 1 < spc2 1 1 < spc3 2 1
Canis_lupus_familiaris 1 1 Gallus_gallus 2 1 Homo_sapiens 0 1 Monodelphis_domestica 2 1 Mus_musculus 1 1 Some comments:
I replaced make all and make pairwithmake Genomes.Orderand changed the ruleConserved.Segments` so that it doesn't try to access the alignment nets /Users/mm49/simulations/simu2/filter_CS.py is my script to filter the master Conserved.Segments file down to the species listed in config.file. It removes blocks when the reference species is the only one left, cf #14983 in your previous reply. By the way, when I remove a block, do I need to decrement the index of the following blocks ? All my chromosome names now start with chr (I noticed this was a requirement of using 1 as the second tag The output is
$ ../DESCHRAMBLER.pl $PWD/params.txt
Constructing syntenic fragments
TREE ((((Homo_sapiens:1.5611,Mus_musculus:2.1014):1.5229,Canis_lupus_familiaris:1.1084)@:1.6692,Monodelphis_domestica:1.6262):1.1361,Gallus_gallus:1.1042);
UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49//simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. 14984 Homo_sapiens /Users/mm49/simulations/DESCHRAMBLER/matt/tree.txt APCFs.1K/SFs/bpdist.txt Estimate JC parameter: 1.31554210696016e-05 readLeafGenomes: Homo_sapiens readLeafGenomes: Mus_musculus readLeafGenomes: Canis_lupus_familiaris T=14984 Initializing Homo_sapiens (ingroup) Initializing Mus_musculus (ingroup) Initializing Canis_lupus_familiaris (ingroup) Initializing Gallus_gallus (outgroup) Initializing Monodelphis_domestica (outgroup) Computing posterior probabilities ... Minimum weight = 0.0001 Conservation score file = APCFs.1K/SFs/block_consscores.txt UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94. UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/mm49/simulations/DESCHRAMBLER/script/../lib/perl/Bio/Tree/TreeFunctionsI.pm line 94.
- Totally 115 APCFs and the consistency tests:
$ grep -c '^#' APCFs.1K/Ancestor.APCF 63 9$ grep -c '\$$' APCFs.1K/Ancestor.APCF 115 — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.
Sure :) I was expecting that but didn't want to jump the gun simu2c.zip
All input data files are now OK, but I found that there was a problem in our script when sorting final APCFs with the same length.
I modified the script “sort_apcfs.pl” in the “script” directory. Can you re-download the script and try again?
On May 26, 2021, at 11:49 PM, Matthieu Muffato @.***> wrote:
Sure :) I was expecting that but didn't want to jump the gun simu2c.zip https://github.com/jkimlab/DESCHRAMBLER/files/6547565/simu2c.zip — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jkimlab/DESCHRAMBLER/issues/6#issuecomment-848833599, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEU6PV7IEVUEOHMIXD76VIDTPUC55ANCNFSM45AHORSQ.
Ah, super, thanks !
All right. Everything's now fine. I could parse Ancestor.APCF
and match all the IDs 👍
Thank you so much for your help !
Hello,
I want to test DESCHRAMBLER on simulated genomes (made as lists of genes), and I would like to know how I can make it not require the genome alignments (no sequences are simulated).
I have generated a
Conserved.Segments
file from my simulations. Like insim.data.new.tar.gz
(the simulations in your 2017 PNAS paper), each gene takes 1kb of space in the genome:I hacked
Makefile.SFs
to use his file instead, and I have changed theRESOLUTION
parameter to 1kb, but it still tries to access the alignments to estimate the breakpoint distances, and the resulting reconstructions are 1 APCF per entry inConserved.Segments
- i.e. no reconstruction.Could you please help ?
Regards, Matthieu