Open RiescoR opened 2 years ago
OK, I have found the problem. The error is in the script runner_personal.py: shutil.copytree ends up in an endless loop because it is copying a directory into a subdirectory of itself (see https://gitlab.com/apparmor/apparmor/-/issues/62). When the master directory contains many files, this causes a crash, because the files are copied recursively almost without end until the available disk space is exhausted. The solution is to omit the _conspecifi folder when the script folder is created. I've made changes to runner_personal.py that resolved the problem (see lines 5 and 53 of the attached code). If you have the same problem, you can replace the content of your runner_personal.py file with the attached code.
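For anyone who wants to see the idea in isolation, here is a minimal sketch of the kind of fix described above. It is not the attached code: the function name `copy_skipping_output`, the demo folder layout, and the pattern `_conspecific*` (taken from the folder name quoted in the original report) are assumptions for illustration only. It uses the `ignore` argument of `shutil.copytree` so that the output folder is never copied into itself.

```python
import shutil
import tempfile
from pathlib import Path


def copy_skipping_output(src, dst, skip_pattern="_conspecific*"):
    """Copy src to dst, skipping entries whose name matches skip_pattern.

    skip_pattern is an assumed name for the pipeline's output folder.
    """
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(skip_pattern))


def demo():
    """Build a throwaway layout where the destination lives inside the source."""
    with tempfile.TemporaryDirectory() as tmp:
        master = Path(tmp) / "master"          # hypothetical master folder
        genes = master / "genes"               # hypothetical genome folder
        genes.mkdir(parents=True)
        (master / "runner.py").write_text("# placeholder\n")
        (genes / "a.fa").write_text(">a\nACGT\n")

        output = genes / "_conspecific"        # output folder inside the source
        output.mkdir()
        scripts = output / "scripts"           # destination; must not exist yet

        copy_skipping_output(master, scripts)

        # List what was copied, relative to the destination.
        return sorted(p.relative_to(scripts).as_posix() for p in scripts.rglob("*"))


if __name__ == "__main__":
    print(demo())
```

Because the output folder is filtered out while the tree is walked, the copy stops after one pass instead of descending into its own destination.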
Actually, if you do not want to change anything, the simplest way to proceed is to keep your genomic data outside of the conspecifix folder; that prevents the copy loop. Usearch also causes problems if you place the binary in the conspecifix folder, so always place it outside, ideally in a directory on your $PATH.
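If you want to check this before launching a run, a small sketch like the following can tell you whether one folder sits inside another. The helper name `is_inside` and the example paths are hypothetical and not part of the pipeline.

```python
from pathlib import Path


def is_inside(child, parent):
    """Return True if child is the same directory as parent or nested under it."""
    child = Path(child).resolve()
    parent = Path(parent).resolve()
    return child == parent or parent in child.parents


if __name__ == "__main__":
    # Hypothetical paths: replace with your genome folder and the pipeline folder.
    genomes = "/data/my_genomes"
    pipeline = "/opt/conspecifix"
    if is_inside(genomes, pipeline):
        print("Genome folder is inside the pipeline folder; move it out first.")
    else:
        print("Genome folder is outside the pipeline folder.")
```

The comparison is done on whole path components, so a sibling folder whose name merely starts with the same characters is not treated as nested.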
I have an issue when running the pipeline with a personal comparison of my own genomes. I used the proposed command line (python runner_personal.py -t 4 /absolute/path/to/folder/with/genes/), and in the "folder with genes" it creates a "_conspecific" folder that contains a "database" and a "script" folder. The problem is that the pipeline systematically copies the master conspecifix folder into the "scripts" folder and then tries to create the same folder structure again, giving an infinite master > folder with genes > script > master > folder with genes > script > ... It keeps copying the same folder into the scripts directory, filling up the hard drive until the maximum path length is reached or the drive is full, at which point it fails with an error. Is this how the pipeline is supposed to work? Thanks!