faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Easy Way to Update Contig Symlinks to New Path? #232

Closed alexkrohn closed 3 years ago

alexkrohn commented 3 years ago

Hi there. I was running a SPAdes assembly, but some of my individuals failed because I hadn't allocated enough memory. Once SPAdes finished, I re-ran it only on the individuals that failed. Now I have about half of my individuals in a spades-assembly directory and half in a spades-assembly-rety directory. I know I can't simply combine these into one directory, because the symlinks in the contigs/ folder would still point to the old directory. Is there an easy way to update the symlinks to point to the correct path?

Thanks,

Alex

brantfaircloth commented 3 years ago

The easiest way is to use a bash script to get all symlinks into the same folder (somewhere). Let's say you have a setup like:

/old/assembly/critter_1
/old/assembly/critter_2
/old/assembly/critter_3

/new/assembly/critter_4
/new/assembly/critter_5
/new/assembly/critter_6

then you make a new directory to hold all the symlinks:

mkdir /some/new/contigs

change into that new directory:

cd /some/new/contigs

Then, assuming the file you want to link to for each critter_* is named contigs.fasta, run:

for i in /old/assembly/*; do name=$(basename "$i"); ln -s "$i/contigs.fasta" "$name"; done
for i in /new/assembly/*; do name=$(basename "$i"); ln -s "$i/contigs.fasta" "$name"; done

That should do it. You will need to adjust paths, etc. to reflect your data/setup.
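For anyone who wants something more defensive than the two one-liners above, here is a minimal sketch of the same idea wrapped in a function. The function name `link_contigs` and its argument order are hypothetical (they are not part of phyluce), and it assumes, as above, that each sample directory holds a file named contigs.fasta. It resolves each sample directory to an absolute path so the links still work when relative paths are passed in, and it skips samples that have no contigs.fasta.

```shell
# Hypothetical helper (not part of phyluce): link each sample's contigs.fasta
# from one or more assembly directories into a single output directory.
# Usage: link_contigs OUTDIR ASSEMBLY_DIR [ASSEMBLY_DIR ...]
link_contigs() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for assembly_dir in "$@"; do
        for sample in "$assembly_dir"/*; do
            # Ignore anything that is not a directory (including an unmatched glob)
            [ -d "$sample" ] || continue
            # Resolve to an absolute path so the symlink target stays valid
            abs=$(cd "$sample" && pwd)
            target="$abs/contigs.fasta"
            # Skip samples whose assembly did not produce contigs.fasta
            if [ ! -e "$target" ]; then
                echo "skipping $sample: no contigs.fasta" >&2
                continue
            fi
            name=$(basename "$sample")
            # -f replaces a link left behind by an earlier run
            ln -sf "$target" "$outdir/$name"
        done
    done
}
```

With the example layout above, the call would be `link_contigs /some/new/contigs /old/assembly /new/assembly`.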

alexkrohn commented 3 years ago

Ah! I was missing the basename command. Thanks for your help.

A few notes: 1) With this approach, all of my taxon names will end in *_spades. No big deal, but something to be aware of when I make configuration files in the future.

2) To be consistent with other uses in the pipeline, I might change the loops so that the symlink names end in contigs.fasta, so the commands would be:

for i in /old/assembly/*; do name=$(basename "$i"); ln -s "$i/contigs.fasta" "$name.contigs.fasta"; done

3) Either in the old directories before running the for loop, or after the for loop has run, I also need to remove the old contigs/* symlinks, which are now useless.
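For the cleanup in note 3, a cautious option is to delete only the symlinks in an old contigs directory and leave any regular files alone. The sketch below assumes that everything to be removed is a symlink; the function name `remove_symlinks` is hypothetical and not part of phyluce. It never touches the files the links point to.

```shell
# Hypothetical helper (not part of phyluce): remove only the symlinks found
# directly inside a directory, leaving regular files and link targets intact.
# Usage: remove_symlinks DIR
remove_symlinks() {
    dir=$1
    for entry in "$dir"/*; do
        # -L is true for symlinks, including ones whose target is missing
        if [ -L "$entry" ]; then
            rm -- "$entry" && echo "removed $entry"
        fi
    done
}
```

It would be called on each old directory in turn, for example `remove_symlinks /path/to/old/contigs`, with the path adjusted to the actual setup.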

Thanks for your help, Brant.