Closed: anacristinareis closed this issue 3 years ago
Hello @anacristinareis, can you please confirm that you are running get_homologues.pl with the -c parameter? Are you also using -m cluster? In that case it needs to re-compute homologues, to tell core genes from pan genes, every time a new genome is added. Because these comparisons depend on the order, many will need to be computed the first time they are needed, and will then be re-used if encountered again. If you are not using -m cluster, then I strongly recommend that you combine -m dryrun with GNU parallel, as explained in http://eead-csic-compbio.github.io/get_homologues/manual/manual.html#dryrun and in a recent thread (https://github.com/eead-csic-compbio/get_homologues/issues/72); that would run those operations in parallel as much as possible. Hope this helps, Bruno
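The "compute the first time, re-use afterwards" behaviour described above can be pictured with a small sketch. This only illustrates the idea and is not the actual get_homologues code: `compute_pair` and `get_pair` are hypothetical names, and the `homologues_<A>_<B>` file naming is assumed from the tmp listing shown later in this thread.

```shell
#!/usr/bin/env bash
# Illustration only: cache pairwise results on disk and re-use them.
# compute_pair is a hypothetical stand-in for the expensive comparison;
# the homologues_<A>_<B> naming is an assumption based on this thread.

compute_pair() {
  echo "result for $1 vs $2"
}

# usage: get_pair CACHE_DIR GENOME_A GENOME_B
get_pair() {
  local dir="$1" a="$2" b="$3"
  local cached="$dir/homologues_${a}_${b}"
  if [ -s "$cached" ]; then
    # a non-empty file already exists for this ordered pair: re-use it
    echo "re-using $cached" >&2
  else
    # first time this ordered pair is requested: compute and save it
    echo "computing $a vs $b" >&2
    compute_pair "$a" "$b" > "$cached"
  fi
  cat "$cached"
}
```

Note that the cache key is the ordered pair, so `A B` and `B A` are stored separately; that is one way to read the remark that the comparisons depend on the order.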
Hello,
I'm using this command: "./get_homologues.pl -d Samples -M -D -t0 -c".
Can I stop my analysis, since it is still running, and run "./get_homologues.pl -d Samples -M -D -t0 -c -m dryrun" instead?
Thanks for your help. Ana Reis
Hi, you can stop that process, sure. Then please run $ ls -ltr Samples_homologues/tmp | tail and share the output here
Hi,
Just to confirm: the only command I need to run is "ls -ltr Samples_homologues/tmp | tail"?
Sorry for my question.
Thanks, Ana Reis
Yes, after killing the get_homologues process. That ls command will let you see the size and timestamp of the saved homologues results.
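As a complement to eyeballing the ls output, a rough sketch like the one below could summarise the sizes of the saved files and count empty ones. `summarize_sizes` is a hypothetical helper, not something shipped with get_homologues, and it assumes the files are named `homologues_*` as in this thread.

```shell
#!/usr/bin/env bash
# summarize_sizes DIR: count the homologues_* files in DIR and report the
# smallest and largest size in bytes, plus how many files are empty.
summarize_sizes() {
  local dir="$1" f size n=0 empty=0 min="" max=""
  for f in "$dir"/homologues_*; do
    [ -f "$f" ] || continue            # skip when the glob matches nothing
    size=$(wc -c < "$f")
    size=$((size))                     # drop padding some wc versions add
    n=$((n + 1))
    if [ "$size" -eq 0 ]; then empty=$((empty + 1)); fi
    if [ -z "$min" ] || [ "$size" -lt "$min" ]; then min=$size; fi
    if [ -z "$max" ] || [ "$size" -gt "$max" ]; then max=$size; fi
  done
  echo "files=$n empty=$empty min=${min:-0} max=${max:-0}"
}
```

For example, `summarize_sizes Samples_homologues/tmp` prints one summary line; empty files, or a minimum far below the maximum, might point to results that were cut short when the process was killed.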
Hi,
Here is the result output:
(base) MBP-de-Ana:get_homologues anareis$ ls -ltr Mbovis_homologues/tmp | tail
-rw-r--r-- 1 anareis staff 75683 21 Mai 17:39 homologues_2397.gbk_Mb1841.gbk
-rw-r--r-- 1 anareis staff 75449 21 Mai 17:39 homologues_2397.gbk_SRR1791984.gbk
-rw-r--r-- 1 anareis staff 69401 21 Mai 17:39 homologues_2397.gbk_Reference.gb
-rw-r--r-- 1 anareis staff 71547 21 Mai 17:39 homologues_2397.gbk_601.gbk
-rw-r--r-- 1 anareis staff 75179 21 Mai 17:39 homologues_2397.gbk_Mb1712.gbk
-rw-r--r-- 1 anareis staff 71020 21 Mai 17:39 homologues_2397.gbk_ERR1203064.gbk
-rw-r--r-- 1 anareis staff 71241 21 Mai 17:39 homologues_2397.gbk_1785.gbk
-rw-r--r-- 1 anareis staff 74909 21 Mai 17:39 homologues_2397.gbk_Mb565.gbk
-rw-r--r-- 1 anareis staff 75503 21 Mai 17:40 homologues_2397.gbk_1339.gbk
-rw-r--r-- 1 anareis staff 75107 21 Mai 17:40 homologues_2397.gbk_NZ_1_Canada.gbk
Thanks, Ana Reis
It all looks good; see that the files have a similar size?
1) Check whether you have parallel installed by typing it in the terminal; otherwise install it.
2) You can now rerun adding -m dryrun, passing the batch file to parallel as explained in the manual.
Ok, thanks.
Just to check: after installing parallel, is the command "./get_homologues.pl -d Samples -M -D -t0 -c -m dryrun", or do I not need to add the -M -D -t0 -c flags?
Thanks for all your help.
Ana Reis
./get_homologues.pl -d Samples -M -D -t0 -c -m dryrun
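Once the dry run has written its list of pending commands, passing that file to GNU parallel could look roughly like the sketch below. `run_batch` is a hypothetical wrapper, and the batch file path is deliberately left as an argument: use whatever location the dry run reports, as described in the manual. If GNU parallel is not found, the sketch falls back to running the commands one after another.

```shell
#!/usr/bin/env bash
# run_batch FILE [JOBS]: run every line of FILE as an independent command,
# using GNU parallel when available. FILE is whatever batch file the
# dry run reports; its name is not assumed here.
run_batch() {
  local batch="$1" jobs="${2:-2}"
  if [ ! -s "$batch" ]; then
    echo "no pending commands in $batch" >&2
    return 1
  fi
  if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # GNU parallel reads one command per line from standard input
    parallel -j "$jobs" < "$batch"
  else
    # fallback when GNU parallel is missing: run the lines sequentially
    bash "$batch"
  fi
}
```

A call such as `run_batch path/to/batch_file 4` would then run up to four pending jobs at a time; after it finishes, the manual section linked earlier in the thread is the place to check whether the dry run needs to be repeated.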
Thank you very much for your time and help.
Best regards, Ana Reis
Hi,
I'm working on a pangenome analysis of Mycobacterium species. After the identification of orthologs between all 70 genomes and the identification of inparalogs, the analysis is progressing slowly.
find_OMCL_clusters: parsing clusters (/Users.../tmp/all_ortho.mcl).
Splitting clusters by Pfam domain composition
Split Pfam clusters
Sample 0 (1317.gbk)
Then it performs all-vs-all homologue comparisons using the 70 genomes.
However, when another sample is added, it performs the comparisons between genomes again and does not re-use the files generated in the previous step.
Thanks for all your help.