wodanaz / Assembling_viruses


Remove path from filename column in depth/genotype compiler #31

Closed johnbradley closed 3 years ago

johnbradley commented 3 years ago

The run-depth-compiler.sh and run-genotype-compiler.sh scripts create TSV files that look something like this:

            VIC10508
    100     1
    3292    1
    101     1

The first row in this file consists of a tab followed by a column name. The column name is derived from the name of another file: the code removes the suffix (".depth.tab" or ".filt.tab") from that filename and uses the result as the column name. For example, VIC10508.depth.tab would be converted to VIC10508.

While working on changes for issue https://github.com/wodanaz/Assembling_viruses/issues/25#issuecomment-788233739 I noticed that this code does not strip the leading directory path. So a path like /data/jpb/tmp.68LzubWRvd/VIC10508.depth.tab would be converted to /data/jpb/tmp.68LzubWRvd/VIC10508.

Should we remove the directory from this column name as well?
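
A minimal sketch of one way the directory could be stripped along with the suffix. This is not the code from the linked scripts; `column_name` is a hypothetical helper name, and the sketch assumes the scripts run under bash (it uses `local`).

```shell
#!/usr/bin/env bash

# Hypothetical helper, not taken from the repository scripts:
# turn an input path into a bare column name.
column_name() {
    local name
    name=$(basename "$1")      # drop the leading directory, if any
    name=${name%.depth.tab}    # drop the depth suffix, if present
    name=${name%.filt.tab}     # drop the genotype suffix, if present
    printf '%s\n' "$name"
}

column_name /data/jpb/tmp.68LzubWRvd/VIC10508.depth.tab   # prints VIC10508
```

Using `basename` first means the suffix removal works the same whether the input is a bare filename or a full path.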

Code that removes suffixes

https://github.com/wodanaz/Assembling_viruses/blob/a5dd984b82dca5da538210b64cfcf1a8fa912a10/scripts/run-depth-compiler.sh#L13

https://github.com/wodanaz/Assembling_viruses/blob/a5dd984b82dca5da538210b64cfcf1a8fa912a10/scripts/run-genotype-compiler.sh#L13

wodanaz commented 3 years ago

Yes, we don't need the path for that file. Thanks!