LCR-BCCRC / lcr-modules

Collection of standard analytical pipelines for genomic and transcriptomic data
https://lcr-modules.rtfd.io
MIT License

vcf2maf: New custom_enst argument breaks when empty #159

Closed lkhilton closed 2 years ago

lkhilton commented 3 years ago

V1.2 of vcf2maf can't be run without a custom_enst value for every genome build. It throws this error:

Option custom-enst requires an argument

This is a bit of a circularity problem, since you first have to run vcf2maf to find the non-canonical transcripts to supply to this argument. Maybe it needs an if statement to test whether the field is empty and run a different command based on that test?
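A minimal sketch of the kind of test suggested above, assuming the command is assembled as a plain string. The helper names `custom_enst_flag` and `build_command` are hypothetical and not part of the module; only the `--custom-enst` flag itself comes from vcf2maf.

```python
import os
import shlex


def custom_enst_flag(path):
    """Return the --custom-enst argument string, or "" when no usable file is configured."""
    # Treat an unset or blank config value as "no custom transcripts".
    if not path or not path.strip():
        return ""
    # Skip the flag for a missing or zero-byte file, which vcf2maf would reject.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return ""
    return "--custom-enst " + shlex.quote(path)


def build_command(base_command, custom_enst_path):
    """Append the optional flag to a base command string (hypothetical helper)."""
    flag = custom_enst_flag(custom_enst_path)
    return base_command if not flag else base_command + " " + flag
```

With this approach an empty config value simply drops the flag rather than passing it without an argument.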

Kdreval commented 3 years ago

I was able to run with an empty file when testing. Can you point it to a dummy file?

lkhilton commented 3 years ago

Will try and report back, thanks!

rdmorin commented 3 years ago

If this works can we set it up and document it in the config to make it more clear that users should leave the empty/dummy file unless they replace it with their own?

Kdreval commented 3 years ago

Absolutely! I did it in the demo config and should have thought about the default config as well:

        # Here you can specify the path to a text file listing custom ENST IDs that override canonical selection.
        # It will be passed to the --custom-enst flag of vcf2maf.
        # If there are no non-canonical transcript IDs to include, leave the switches empty.
        # This is just an example of how to include the list of custom IDs.
        switches:
            custom_enst:
              hg38: ""
              grch37: "data/custom_enst.txt"
              hs37d5: ""

EDITED: Not even like this; a path should be given for each build. I will update it once Laura confirms it worked for her.

lkhilton commented 3 years ago

This is the result with an empty file:

ERROR: Provided --custom-enst file is missing or empty: etc/dummy_enst.txt

rdmorin commented 3 years ago

It probably worked for Kostia because he has a copy of that file in his gambl-repo "etc" folder. The config probably just needs to be changed to refer to the one under MODS_DIR or whatever it's called.

Kdreval commented 3 years ago

That might be the case. However, I also tested it on the demo sample, which is in another repo and another directory. Most likely it worked for me because I have a file there as well. I am setting it up to test outside the GSC network, to see if I can reproduce the same error.

rdmorin commented 3 years ago

In my Battenberg config I set the path to src this way:

src_dir: "{MODSDIR}/src/"

I presume this will fix the issue:

grch37: "{MODSDIR}/etc/custom_enst.txt"

Did "data" get substituted with "etc" in a recent commit, perhaps?
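To illustrate the idea, here is a rough sketch of how such a placeholder could be expanded, assuming plain string substitution. The module's actual mechanism may differ, and `resolve_config_path`, `resolve_per_build`, and the example directory are hypothetical.

```python
import os


def resolve_config_path(template, mods_dir):
    """Expand a {MODSDIR} placeholder and normalise the resulting path."""
    return os.path.normpath(template.format(MODSDIR=mods_dir))


def resolve_per_build(paths_by_build, mods_dir):
    """Resolve one path per genome build, mirroring the custom_enst switch layout."""
    return {
        build: resolve_config_path(template, mods_dir)
        for build, template in paths_by_build.items()
    }


# Example with a hypothetical modules directory.
resolved = resolve_per_build(
    {"grch37": "{MODSDIR}/etc/custom_enst.txt"},
    "/path/to/lcr-modules",
)
```

Anchoring the path to the modules directory means the file is found regardless of which project directory the workflow is launched from.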

Kdreval commented 3 years ago

I found it a bit unusual, but the issue is that the empty file cannot be 0 b. I had a newline (just a return) in my file, so it was 5 b in size and therefore technically not empty. I found that if I remove the newline from the file (making it 0 b), vcf2maf exits with the same error complaining about the empty file. In any case, the switches in the config should not be empty, contrary to what I put in the comments, so I will update the module with a patch. As an interim solution, would you mind adding a newline to the dummy file? Thanks for testing it and reporting this issue!
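The interim workaround can be sketched as a small helper that makes sure the placeholder file is never 0 bytes. This is only an illustration; `ensure_nonempty_dummy` is a hypothetical name, not something the module provides.

```python
import os


def ensure_nonempty_dummy(path):
    """Create or pad a placeholder file so that it is at least one byte long."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # A missing or zero-byte file triggers the "missing or empty" error,
    # so write a single newline in that case and leave other files untouched.
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        with open(path, "w", newline="") as handle:
            handle.write("\n")
    return os.path.getsize(path)
```

Calling it on a file that already lists transcript IDs leaves the contents untouched and just reports the file size in bytes.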

lkhilton commented 3 years ago

The newline was the key, thanks!

rdmorin commented 3 years ago

Can this be closed?

Kdreval commented 3 years ago

This has been resolved. Along with this patch, I also wanted to push an update to the demo module where vcf2maf is run, and I am still working on that update. I will close the issue once I issue the PR.

Kdreval commented 2 years ago

I have added documentation for this in the PR https://github.com/LCR-BCCRC/lcr-modules/pull/165