microbiomedata / mg_annotation

Metagenome Annotation Workflow

Mapping - basic changes #28

Closed by kaijli 3 months ago

kaijli commented 3 months ago

@aclum @scanon Successful run of the mapping task, JAWS id 79113. The other mapping branch got too convoluted, so this is from an older commit on that branch.

kaijli commented 3 months ago

In response to this previous change request: https://github.com/microbiomedata/mg_annotation/pull/27#pullrequestreview-2023122092

  1. For this to follow nmdc naming conventions we need two identifier variables, one for the assembly identifier (ex nmdc:wfmgas-11-0080kf20.1) and one for the annotation identifier (ex nmdc:wfmgan-11-001dky13.1). The assembly identifier is what should get passed in to the make_map_file function for the generation of the mapping file. The annotation identifier is what should get passed to finish_ano, which renames the annotation files.

@scanon @aclum Not sure if this was ever fully discussed or ironed out, but it should be easy to implement. I think Shane was looking at how, or whether, to pull the assy-id from somewhere during automation.
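
To make the two-identifier idea concrete, here is a minimal Python sketch of carrying both identifiers together and checking that each one is of the expected kind. It is not code from this repository: the `WorkflowIds` class, the `id_kind` helper, and the identifier pattern are hypothetical, and the pattern is inferred only from the two example identifiers quoted above.

```python
import re
from dataclasses import dataclass

# Assumed pattern, inferred only from the two example identifiers in the
# review comment; the real nmdc identifier grammar may be broader.
_ID_RE = re.compile(r"^nmdc:(?P<kind>wfmgas|wfmgan)-\d+-[0-9a-z]+\.\d+$")


def id_kind(identifier: str) -> str:
    """Return 'wfmgas' (assembly) or 'wfmgan' (annotation) for an identifier."""
    match = _ID_RE.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    return match.group("kind")


@dataclass(frozen=True)
class WorkflowIds:
    """Hypothetical container holding both identifiers side by side."""

    assembly_id: str    # intended for make_map_file
    annotation_id: str  # intended for finish_ano

    def __post_init__(self) -> None:
        # Reject swapped or malformed identifiers early.
        if id_kind(self.assembly_id) != "wfmgas":
            raise ValueError("assembly_id must be a wfmgas identifier")
        if id_kind(self.annotation_id) != "wfmgan":
            raise ValueError("annotation_id must be a wfmgan identifier")


ids = WorkflowIds(
    assembly_id="nmdc:wfmgas-11-0080kf20.1",
    annotation_id="nmdc:wfmgan-11-001dky13.1",
)
```

Validating at construction time means a swapped pair fails before either task runs, rather than producing misnamed files.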

  1. The newly named nucleotide fasta file and the mapping file need to be output files for finish_ano. The newly named contig file that is an output of make_map_file should have a suffix of _contigs.fna, and the mapping file should have a suffix of _contig_names_mapping.tsv.

Done
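
For reference, the suffix convention requested above can be sketched in Python as follows. This is an illustration only, not the workflow's actual code: the `mapping_output_names` helper and the `example_prefix` value are hypothetical, and how the real prefix is derived from an identifier is not shown here.

```python
# Suffixes taken from the change request quoted above.
CONTIGS_SUFFIX = "_contigs.fna"
MAPPING_SUFFIX = "_contig_names_mapping.tsv"


def mapping_output_names(prefix: str) -> dict:
    """Build the two output file names for a given file prefix.

    `prefix` is a placeholder; deriving it from an nmdc identifier
    is outside the scope of this sketch.
    """
    if not prefix:
        raise ValueError("prefix must be non-empty")
    return {
        "contigs": prefix + CONTIGS_SUFFIX,
        "mapping": prefix + MAPPING_SUFFIX,
    }


names = mapping_output_names("example_prefix")
print(names["contigs"])  # example_prefix_contigs.fna
print(names["mapping"])  # example_prefix_contig_names_mapping.tsv
```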

kaijli commented 3 months ago

Made the requested changes, JAWS id 79268.

aclum commented 3 months ago

Looks good.