ed-lau / jcast

Junction centric alternative splicing translator
MIT License

not found #16

Closed naqvia closed 2 years ago

naqvia commented 2 years ago

I keep getting the following error:

Traceback (most recent call last):
  File "/Users/naqvia/Desktop/yes/envs/JCAST/bin/jcast", line 8, in <module>
    sys.exit(main())
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 438, in main
    args.func(args)
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 61, in runjcast
    assert os.path.exists(os.path.join(args.rmats_folder, 'MXE.MATS.JC.txt')), 'rMATS files not found, check directory.'
AssertionError: rMATS files not found, check directory.

Here is my command:

jcast rmats/ ~/Desktop/genome_files/ ~/Desktop/genome_files/

I have a bunch of SE.MATS.JC.txt and MXE.MATS.JC.txt files in the folder. I have tried a few things, but nothing works. Please advise.

-a

ed-lau commented 2 years ago

Have you tried using absolute paths for your rMATS folder?

On a related note, the second and third arguments should refer to the annotation (gtf) and genome (fa/fa.gz) files directly rather than the directories.

naqvia commented 2 years ago

Yes, I have tried absolute paths and did what you suggested. When I modify the main.py script and hardcode the rMATS output file name and path, it seems to get further, but I then get this error:

Traceback (most recent call last):
  File "/Users/naqvia/Desktop/yes/envs/JCAST/bin/jcast", line 8, in <module>
    sys.exit(main())
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 441, in main
    args.func(args)
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 120, in runjcast
    for rma in [rmats_results.rmats_mxe,
AttributeError: 'str' object has no attribute 'rmats_mxe'


ed-lau commented 2 years ago

Can you show me which parts you modified, and with the unmodified JCAST, could you copy your commands with absolute paths and the associated error messages here?

naqvia commented 2 years ago

I just added a line and commented out others (see below):

#assert os.path.exists(os.path.join(args.rmats_folder, 'MXE.MATS.JC.txt')), 'rMATS files not found, check directory.'
#rmats_results = RmatsResults(rmats_dir=args.rmats_folder)
rmats_results = "/Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats/09ac93e6-8f27-40ea-bc31-46639bc3ef8b.control-XX_vs_BS_SG2KB6.MXE.MATS.JC.txt"

As for the command and error with the modified version:

(JCAST) MacBook-Pro-2:splicing-based_neoepitope_discovery naqvia$ jcast /Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats/ /Users/naqvia/Desktop/genome_files/gencode.v38.annotation.gtf /Users/naqvia/Desktop/genome_files/GRCh38.primary_assembly.genome.fa

Traceback (most recent call last):
  File "/Users/naqvia/Desktop/yes/envs/JCAST/bin/jcast", line 8, in <module>
    sys.exit(main())
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 441, in main
    args.func(args)
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 120, in runjcast
    for rma in [rmats_results.rmats_mxe,
AttributeError: 'str' object has no attribute 'rmats_mxe'

With the original version:

(JCAST) MacBook-Pro-2:splicing-based_neoepitope_discovery naqvia$ jcast /Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats/ /Users/naqvia/Desktop/genome_files/gencode.v38.annotation.gtf /Users/naqvia/Desktop/genome_files/GRCh38.primary_assembly.genome.fa
/Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats/
Traceback (most recent call last):
  File "/Users/naqvia/Desktop/yes/envs/JCAST/bin/jcast", line 8, in <module>
    sys.exit(main())
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 441, in main
    args.func(args)
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 64, in runjcast
    assert os.path.exists(os.path.join(args.rmats_folder, 'MXE.MATS.JC.txt')), 'rMATS files not found, check directory.'
AssertionError: rMATS files not found, check directory.

ed-lau commented 2 years ago

Did you try renaming your rMATS files from "09ac93e6-8f27-40ea-bc31-46639bc3ef8b.control-XX_vs_BS_SG2KB6.MXE.MATS.JC.txt" to the standard output names, i.e., just MXE.MATS.JC.txt etc.?

Also for the first case with the modified line, that wouldn't work because the code expects an RmatsResults object. You could potentially try rmats_results = RmatsResults(rmats_dir="/Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats") after renaming the rMATS files to their standard names.
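For anyone else whose rMATS files carry a run-specific prefix, here is a minimal sketch of copying them into a clean folder under the standard names. It is not part of JCAST; the helper names `standard_name` and `copy_with_standard_names` are made up for illustration, and it assumes the prefix is separated from the standard name by a dot, as in the file name quoted above.

```python
import shutil
from pathlib import Path

# Standard rMATS junction-count (JC) output file names.
STANDARD_NAMES = (
    "MXE.MATS.JC.txt",
    "SE.MATS.JC.txt",
    "RI.MATS.JC.txt",
    "A3SS.MATS.JC.txt",
    "A5SS.MATS.JC.txt",
)


def standard_name(filename):
    """Return the standard rMATS name that `filename` ends with, or None."""
    for name in STANDARD_NAMES:
        if filename == name or filename.endswith("." + name):
            return name
    return None


def copy_with_standard_names(src_dir, dest_dir):
    """Copy prefixed rMATS files from src_dir into dest_dir under standard names.

    Hypothetical helper: originals are left untouched. Use a dest_dir that
    differs from src_dir. Raises ValueError if two source files would map to
    the same standard name (e.g. several runs mixed in one folder).
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = {}
    for path in sorted(src.iterdir()):
        name = standard_name(path.name) if path.is_file() else None
        if name is None:
            continue  # not an rMATS JC file
        if name in copied:
            raise ValueError(
                f"{path.name} and {copied[name].name} both map to {name}"
            )
        shutil.copy2(path, dest / name)
        copied[name] = path
    return sorted(copied)


# Example (paths are placeholders):
# copy_with_standard_names("rmats_prefixed", "rmats_clean")
```

Copying into a separate folder keeps the original rMATS output intact, and the returned list shows which of the standard files were actually found.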

naqvia commented 2 years ago

I renamed the file to the standard output name and reverted to the original code and still get an error:

(JCAST) MacBook-Pro-2:splicing-based_neoepitope_discovery naqvia$ jcast /Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats /Users/naqvia/Desktop/genome_files/gencode.v38.annotation.gtf /Users/naqvia/Desktop/genome_files/GRCh38.primary_assembly.genome.fa
Traceback (most recent call last):
  File "/Users/naqvia/Desktop/yes/envs/JCAST/bin/jcast", line 8, in <module>
    sys.exit(main())
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 421, in main
    args.func(args)
  File "/Users/naqvia/Desktop/yes/envs/JCAST/lib/python3.7/site-packages/jcast/main.py", line 61, in runjcast
    assert os.path.exists(os.path.join(args.rmats_folder, 'SE.MATS.JC.txt')), 'rMATS files not found, check directory.'
AssertionError: rMATS files not found, check directory.

With your suggested modification I get the same error. The files in the rmats directory:

(JCAST) MacBook-Pro-2:splicing-based_neoepitope_discovery naqvia$ ls /Users/naqvia/Desktop/splicing-based_neoepitope_discovery/rmats
MXE.MATS.JC.txt

ed-lau commented 2 years ago

JCAST expects a typical unaltered rMATS output folder. See if you can make sure the five splice types from rMATS are all in the rMATS output directory, named exactly as MXE.MATS.JC.txt, SE.MATS.JC.txt, RI.MATS.JC.txt, A3SS.MATS.JC.txt, A5SS.MATS.JC.txt.
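A quick way to check this before running JCAST is sketched below. This is a standalone pre-flight check, not JCAST's own code, and the function name `missing_rmats_files` is made up for illustration.

```python
import os

# The five rMATS junction-count files listed above.
EXPECTED_FILES = (
    "MXE.MATS.JC.txt",
    "SE.MATS.JC.txt",
    "RI.MATS.JC.txt",
    "A3SS.MATS.JC.txt",
    "A5SS.MATS.JC.txt",
)


def missing_rmats_files(rmats_dir):
    """Return the expected rMATS file names that are not present in rmats_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(rmats_dir, name))
    ]


# Example (path is a placeholder):
# missing = missing_rmats_files("/path/to/rmats")
# if missing:
#     print("Missing rMATS files:", ", ".join(missing))
```

An empty list means all five files are in place. Unlike the assertion in the traceback, this reports every missing file at once rather than only the first one it checks.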

naqvia commented 2 years ago

Thank you! That resolved the issue. Are there any plans to make the program more scalable and customizable? For example, what if we have multiple rMATS runs, or just want to assess "SE" splicing cases?

ed-lau commented 2 years ago

Thanks for the input. For multiple rMATS runs, you should be able to use a batch script or pipeline to run JCAST multiple times. For different splice types, we can potentially include an option to write the different splice types into different output files. For now, you should be able to filter the splice types based on the FASTA entry headers.
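As a rough sketch of the batch approach, the snippet below builds one `jcast <rmats_dir> <gtf> <fasta>` command per rMATS run. It assumes each run sits in its own subdirectory of a common parent folder; that layout and the function names `build_jcast_commands` and `run_all` are assumptions for illustration. It does not handle where JCAST writes its output, so check that separate runs do not overwrite each other.

```python
import subprocess
from pathlib import Path


def build_jcast_commands(rmats_root, gtf, fasta):
    """Build one jcast command per subdirectory of rmats_root.

    Uses the positional form shown earlier in this thread:
    jcast <rmats_dir> <gtf> <fasta>
    """
    root = Path(rmats_root)
    return [
        ["jcast", str(run_dir), str(gtf), str(fasta)]
        for run_dir in sorted(root.iterdir())
        if run_dir.is_dir()
    ]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them one after another if dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)


# Example (paths are placeholders):
# cmds = build_jcast_commands("rmats_runs", "annotation.gtf", "genome.fa")
# run_all(cmds, dry_run=True)
```

Starting with `dry_run=True` lets you inspect the generated commands before anything is executed.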