Open sivico26 opened 4 months ago
Hi @sivico26, I can't make promises, but if you can share one graph, I could look at where the bottleneck might be!
Hi @AndreaGuarracino.
I can do that. Where should I send it? To your email at UTHSC?
By the way, the job just passed 2200 h. At this scale, `odgi unchop` is definitely the bottleneck of `smoothxg`, taking more than 80% of the time (and counting). If we find a way to address this, people working on crops who want to include wild ancestors in their pangenomes will surely appreciate it.
My cluster's admins speculated that `odgi unchop` is $O(n^2)$ (in the number of nodes, I imagine). Can you confirm whether this is the case? Knowing this would help us decide whether to wait for the job or to proceed with the input graph for our work.
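One way to test the quadratic guess empirically, as a minimal sketch: time `odgi unchop` on a few subgraphs of increasing node count, fit a power law $t = c \cdot n^k$ on a log-log scale, and extrapolate to the full graph. The helper names `fit_power_law` and `predict_time` are hypothetical, and the timings below are synthetic placeholders generated from a quadratic formula, not measurements of `odgi unchop`.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(c) + k * log(n). Returns (c, k)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                      # slope = estimated exponent
    c = math.exp(mean_y - k * mean_x)  # intercept = estimated constant
    return c, k


def predict_time(c, k, n):
    """Extrapolate the fitted model t = c * n**k to a graph with n nodes."""
    return c * n ** k


# Synthetic placeholder data (assumption): runtimes that grow exactly as n^2.
# Replace with real (node count, wall-clock time) pairs from small test runs.
sizes = [1_000, 2_000, 4_000, 8_000]
times = [3e-6 * n ** 2 for n in sizes]

c, k = fit_power_law(sizes, times)
print(f"estimated exponent k = {k:.2f}")
print(f"predicted time for 16000 nodes = {predict_time(c, k, 16_000):.1f}")
```

An exponent close to 2 on real timings would support the quadratic guess, while a value close to 1 would suggest that waiting for the job is more reasonable. Extrapolating far beyond the measured sizes is only a rough guide, since memory effects and graph structure can change the behavior at scale.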
Sending it to my UTHSC email is fine.
As a short answer, I would skip the unchopping in smoothxg in order to work with a smoothed graph. I've never made a formal complexity analysis of the unchop algorithm, but the issue has a quadratic smell!
(In reply to the message from Simón Villanueva Corrales of Wednesday, September 4, 2024, 4:21:23 PM; subject: Re: [pangenome/odgi] Odgi unchop performance for large graphs (Issue #584))
Hi @AndreaGuarracino,
My cluster's admins recently informed me that they need to shut down the server where my job is running. Thus, my job will be killed after running ~2600 hours.
Is there a way to resume `smoothxg` processing somewhere by copying the current temporary files? These are the files currently in the folder:
[sivico26@urga1 ~]$ ls /scratch/sivico26/job_6857522.cerit-pbs.cerit-sc.cz/results_uv/tmp/temp-27WbAw/ -lh
total 108G
-rw-------. 1 sivico26 meta 98G jun 15 14:41 0LXmlE
-rw-------. 1 sivico26 meta 4,9G jun 10 14:38 E9L5T3
-rw-------. 1 sivico26 meta 4,9G jun 10 14:36 Gsu5sk
-rw-------. 1 sivico26 meta 12K jun 15 14:42 hAFKNe
What do you think: would they be of any use? Thank you in advance.
Hello again,
I realized that even though my issue affects `smoothxg`, it is more concerned with an `odgi` algorithm, so I am also posting it here for future reference.