lydiayliu closed this 2 years ago
By the way, this is what the output looks like for the fasta entry with the same config, except with merge_variant_noncoding = 'no':
yiyangliu@ip-0A125212:/hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-10/pipeline-meta-call-NonCanonicalPeptide-0.0.1/ACH-000028/call-NonCanonicalPeptide-1.0.0/ACH-000028/moPepGen-0.5.1/output$ tree
.
├── ACH-000028_variant_peptides_summary.txt
├── decoy
│ ├── ACH-000028_Coding_encode_decoy.fasta
│ ├── ACH-000028_Coding_encode_decoy.fasta.dict
│ ├── ACH-000028_Noncoding-additional_encode_decoy.fasta
│ ├── ACH-000028_Noncoding-additional_encode_decoy.fasta.dict
│ ├── ACH-000028_Noncoding_encode_decoy.fasta
│ └── ACH-000028_Noncoding_encode_decoy.fasta.dict
├── encode
│ ├── ACH-000028_Coding_encode.fasta
│ ├── ACH-000028_Coding_encode.fasta.dict
│ ├── ACH-000028_Noncoding-additional_encode.fasta
│ ├── ACH-000028_Noncoding-additional_encode.fasta.dict
│ ├── ACH-000028_Noncoding_encode.fasta
│ └── ACH-000028_Noncoding_encode.fasta.dict
└── split
  ├── ACH-000028_Coding.fasta
  ├── ACH-000028_Noncoding-additional.fasta
  └── ACH-000028_Noncoding.fasta
Also, the fasta entry with merge_variant_noncoding = 'both' and filtering gives the correct outputs:
yiyangliu@ip-0A125212:/hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_filter_split/pipeline-NonCanonicalPeptide-0.0.1/ACH-000005/call-NonCanonicalPeptide-1.0.0/ACH-000005/moPepGen-0.6.1/output$ tree
.
├── ACH-000005_merged_peptides_filtered.fasta
├── ACH-000005_merged_peptides_filtered_summary.txt
├── ACH-000005_noncoding_peptides_filtered.fasta
├── ACH-000005_variant_peptides_filtered.fasta
├── ACH-000005_variant_peptides_filtered_summary.txt
├── ACH-000005_variant_peptides_summary.txt
├── decoy
│ ├── ACH-000005_Coding_encode_decoy.fasta
│ ├── ACH-000005_Coding_encode_decoy.fasta.dict
│ ├── ACH-000005_merged.fasta
│ ├── ACH-000005_merged_peptides_filtered_encode_decoy.fasta
│ ├── ACH-000005_merged_peptides_filtered_encode_decoy.fasta.dict
│ ├── ACH-000005_Noncoding-additional_encode_decoy.fasta
│ ├── ACH-000005_Noncoding-additional_encode_decoy.fasta.dict
│ ├── ACH-000005_Noncoding_encode_decoy.fasta
│ └── ACH-000005_Noncoding_encode_decoy.fasta.dict
├── encode
│ ├── ACH-000005_Coding_encode.fasta
│ ├── ACH-000005_Coding_encode.fasta.dict
│ ├── ACH-000005_merged_peptides_filtered_encode.fasta
│ ├── ACH-000005_merged_peptides_filtered_encode.fasta.dict
│ ├── ACH-000005_Noncoding-additional_encode.fasta
│ ├── ACH-000005_Noncoding-additional_encode.fasta.dict
│ ├── ACH-000005_Noncoding_encode.fasta
│ └── ACH-000005_Noncoding_encode.fasta.dict
└── split
  ├── ACH-000005_Coding.fasta
  ├── ACH-000005_Noncoding-additional.fasta
  └── ACH-000005_Noncoding.fasta
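For comparing two output directories like the ones above without eyeballing the trees, here is a minimal sketch. It only uses the standard library; the function names `relative_outputs` and `diff_outputs` are hypothetical helpers, not part of moPepGen or the pipeline, and it assumes the sample ID appears verbatim in the file names, as it does in the listings above.

```python
from pathlib import Path


def relative_outputs(output_dir, sample_id):
    """Collect every file under output_dir as a path relative to it.

    The sample ID is replaced by a placeholder so that runs on different
    samples (e.g. ACH-000028 vs ACH-000005) can be compared directly.
    """
    root = Path(output_dir)
    return {
        p.relative_to(root).as_posix().replace(sample_id, "{sample}")
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_outputs(dir_a, sample_a, dir_b, sample_b):
    """Return (only_in_a, only_in_b) as sorted lists of normalized paths."""
    a = relative_outputs(dir_a, sample_a)
    b = relative_outputs(dir_b, sample_b)
    return sorted(a - b), sorted(b - a)
```

Calling `diff_outputs(no_merge_dir, "ACH-000028", both_dir, "ACH-000005")` on the two `output` directories would list the files that exist in only one of the runs, which makes it easier to see which merged or filtered files a given config does or does not produce.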
Seems like the pipeline's merge logic is a little off. By the way, do we want to implement the complex filtering logic below in the pipeline?
https://crispy-invention-072327aa.pages.github.io/filter-fasta/#complex-filtering
I suppose we can add the complex filtering to the wishlist; I don't really see any urgent use for it yet!
The merge logic is more important for me right now XD
I want to use the fasta entry and do both merge and split using the pipeline, so I'm expecting the following files, which are currently missing:
Here's the current config
Command