galaxyproject / tools-devteam

Contains a set of Galaxy Tools mostly written by the Galaxy Team.

Allow non-coordinate-sorted bam input to eXpress #581

Open gnaisha opened 3 years ago

gnaisha commented 3 years ago

eXpress requires randomly sorted bams as input, but having the input type set to "bam" causes the bam to be coordinate sorted before running. This PR adds "qname_sorted.bam" and "unsorted.bam" input types to avoid this sorting.

bernt-matthias commented 3 years ago

Thanks for the contribution. The checks are currently not running because github actions changed a bit. They should run again after https://github.com/galaxyproject/tools-devteam/pull/582 is merged.

Q: Should the sorted bam types be removed? Also a bump of the tool version is necessary.

bernt-matthias commented 3 years ago

Can you rebase the PR branch? Then the tests will run.

gnaisha commented 3 years ago

Thanks for the fixes to allow the checks to pass.

I tried removing the sorted bam input from the wrapper, but in my local tests it seems that sorted bam is still accepted as input, causing eXpress to fail. I've made the change in the wrapper however, since this should be an invalid input type, and will file an issue with Galaxy main regarding disallowing sorted bam in such cases.

I see that there is a tool version in the wrapper, and an eXpress version in tool_dependencies.xml. These are both set to 1.1.1 currently - is this coincidence? I assume I should only bump the tool wrapper version number.

bernt-matthias commented 3 years ago

> I see that there is a tool version in the wrapper, and an eXpress version in tool_dependencies.xml. These are both set to 1.1.1 currently - is this coincidence? I assume I should only bump the tool wrapper version number.

You can remove the file tool_dependencies.xml

gnaisha commented 3 years ago

I've removed that file, and updated eXpress to the most recent version I'm seeing in bioconda.

bernt-matthias commented 3 years ago

I should have asked this earlier: what is the exact problem with sorted BAM files? The manual suggests that sorting is necessary:

If you aligned your reads with Bowtie, your alignments will be properly ordered already. If you used another tool, you should ensure that they are properly sorted

gnaisha commented 3 years ago

Sorry, I should have been more explicit about what I meant by "sorted": eXpress does require sorted input, but it needs to be sorted by read name rather than by coordinate.

You can sort your BAM using this command:

samtools sort -n hits.bam hits.sorted

Galaxy seems to define the "bam" type as being coordinate sorted, and when I used a query-name sorted bam as input to the current eXpress wrapper, Galaxy converted it to a coordinate-sorted bam before running eXpress, causing eXpress to fail. Adding qname_sorted.bam type allows this bam to be used as is, sorted by read name.
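To check which order a given BAM is actually in before it reaches eXpress, one option is to read the `SO` field of the `@HD` header line. The following is a minimal sketch and not part of this PR: `sort_order` is a hypothetical helper name, and it assumes the header text is piped in on stdin (for example from `samtools view -H`).

```shell
# Hypothetical helper: print the sort order declared in a SAM/BAM header.
# Reads header text on stdin and prints the value of the SO field from the
# @HD line, or "unknown" when there is no @HD line or no SO field.
sort_order() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
      }
    }
    END { if (!found) print "unknown" }
  '
}

# Example usage (assumes samtools is installed and hits.bam exists):
#   samtools view -H hits.bam | sort_order
```

A coordinate-sorted file should report `coordinate` and a read-name-sorted one `queryname`, although the header only records what the producing tool declared. With samtools 1.3 or later the name sort is written `samtools sort -n -o hits.sorted.bam hits.bam`; the `samtools sort -n hits.bam hits.sorted` form quoted above is the older syntax.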

bwlang commented 3 years ago

Maybe the wrapper should sort on a pipe, for this and other tools that need oddball BAMs?

Brad
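A rough sketch of what sorting on a pipe could look like in a wrapper's command line. This is an illustration only, not what this PR implements: `express_name_sorted` is a hypothetical function name, and the sketch assumes samtools 1.3 or later and that eXpress reads SAM alignments from standard input when no alignment file is given.

```shell
# Hypothetical sketch: name-sort a BAM on the fly and stream it to eXpress.
# $1 = target sequences (FASTA), $2 = input BAM in any sort order.
express_name_sorted() {
  # -n sorts by read name; -O sam writes uncompressed SAM to stdout,
  # which is then piped to eXpress (assumed here to read SAM from stdin).
  samtools sort -n -O sam "$2" | express "$1"
}

# Example usage (assumes both tools are on PATH):
#   express_name_sorted transcripts.fasta hits.bam
```

Sorting inside the wrapper would add run time and temporary disk usage to every job, which is the trade-off against only accepting `qname_sorted.bam` input.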

On Dec 31, 2020, at 5:39 PM, gnaisha wrote:

Sorry I should have been explicit referring to "sorted" - eXpress does require sorted input, but it needs to be sorted by read name rather than coordinate.

You can sort your BAM using this command:

samtools sort -n hits.bam hits.sorted

Galaxy seems to define the "bam" type as being coordinate sorted, and when I used a query-name sorted bam as input to the current eXpress wrapper, Galaxy converted it to a coordinate-sorted bam before running eXpress, causing eXpress to fail. Adding qname_sorted.bam type allows this bam to be used as is, sorted by read name.
