Closed chrisquince closed 5 years ago
Issue 1 : merged bins are called "binmerged" in Common_unitigs.py, which makes it fail to parse them on the second run. Solution : Common_unitigs.py now takes the list of bins as arguments.
Issue 2 : binmerged_3 and Bin_137 were missing from bin_init/bin_cogs_to_ignore.tsv. They have too many shared COGs (10 of them), but these are not really in common with each other, so they should not be run. Corrected.
All changes have been pushed and /mnt/gpfs/Hackathon/FMTMeren/CoAssembly77_Rerun is running.
OK, so the changed script does not actually run the merged bins. I think that is because the glob pattern does not match their naming convention:
def bin_paths_by_type(bin_type):
    return [os.path.dirname(path) for path in glob.glob("subgraphs/%s/Bin*/SCG.fna" % bin_type)]
I am going to try rationalising the naming convention to see if this fixes it.
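To make the suspected mismatch concrete, here is a minimal sketch of how the "Bin*" component of the glob pattern quoted above treats the two naming conventions. The directory names are assumptions taken from the bin names mentioned in this thread, not a listing of the real subgraphs folder, and the "[Bb]in*" pattern is only one possible way to accept both, not necessarily what was pushed. glob matches names with fnmatch rules, which are case-sensitive on Linux, so fnmatch.fnmatchcase is used here to reproduce that behaviour on any platform.

```python
import fnmatch

# Hypothetical directory names, based on the bins mentioned in this thread.
bin_dirs = ["Bin_134", "Bin_137", "binmerged_3"]

# Directory component of the pattern used in bin_paths_by_type.
# fnmatchcase compares case-sensitively, like glob does on Linux.
matched = [d for d in bin_dirs if fnmatch.fnmatchcase(d, "Bin*")]
missed = [d for d in bin_dirs if d not in matched]

# One possible alternative (an assumption): a character class that
# accepts both the "Bin_" and the "binmerged_" conventions.
both = [d for d in bin_dirs if fnmatch.fnmatchcase(d, "[Bb]in*")]

print("matched by Bin*:    ", matched)
print("missed by Bin*:     ", missed)
print("matched by [Bb]in*: ", both)
```

Renaming the merged bins so that they share the "Bin" prefix would avoid changing the pattern at all, at the cost of touching every place that writes or reads those names.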
OK, pushed changes to Common_Unitigs.py that seem to fix this. Will close when I have tested on FMTMeren.
Yes, this seems to work, so I am now closing the issue.
Running the FMT data set from Meren throws an error from the flag_bad_cogs rule for Bin_134. I find this step very hard to debug, and I think it would be a good candidate for refactoring.
The run is here:
/mnt/gpfs/Hackathon/FMTMeren/CoAssembly77_Rerun