adbailey4 / yeast_rrna_modification_detection

MIT License

0 outputs when reproducing results #4

Closed AminaLEM closed 2 years ago

AminaLEM commented 2 years ago

Hello @adbailey4 ,

I get 0 outputs when I run :

runSignalAlign.py run --config `pwd`/yeast_rrna_modification_detection/testing/inference.config.json
[SignalAlignment.run] NOTICE: Creating forward and backward fasta files.
[SignalAlignment.run] NOTICE: Creating forward and backward fasta files.
[multithread_signal_alignment_samples] Running SignalAlign on sample: wild_type
[SignalAlignment.run] NOTICE: Creating forward and backward fasta files.
[SignalAlignment.run] NOTICE: Creating forward and backward fasta files.
[multithread_signal_alignment_samples] Running SignalAlign on sample: wild_type
[SignalAlignment.run] NOTICE: Creating forward and backward fasta files.
[SignalAlignment.run] NOTICE: Creating forward and backward fasta files.
[multithread_signal_alignment_samples] Running SignalAlign on sample: wild_type
[multithread_signal_alignment_samples] wild_type generated 0 output_files
[multithread_signal_alignment_samples] Running SignalAlign on sample: ivt
[multithread_signal_alignment_samples] Running SignalAlign on sample: ivt
[multithread_signal_alignment_samples] Running SignalAlign on sample: ivt
[multithread_signal_alignment_samples] ivt generated 0 output_files

#  signalAlign - finished alignments

[signalAlign] Complete
Running Time = 29.307354552205652 seconds

#  signalAlign - finished alignments

I used your config file inference.config.json, including the model testing/yeast_rrna_ivt_wt_trained_071521.model. I would appreciate it if you could help me with this and share the trained model used in your paper.

Regards, Amina

adbailey4 commented 2 years ago

Did you download all the data referenced in the repo? If so, can you share the config file you used, with the updated paths to the data you are trying to analyze?

Basically, it looks like signalAlign didn't find any data to process, so it didn't do anything.

AminaLEM commented 2 years ago

Hi @adbailey4,

Thank you for your response. Actually, the problem could be the trained model. Can you point me to your trained model? Also, did you use one trained model for all inference results, or did you train one for WT vs IVT and another for WT vs KO? I ask because in the inference_pipeline.py file the model is called yeast_rrna_depletion_trained_040721.model, whereas in the config file of the testing example it is called yeast_rrna_ivt_wt_trained_071521.model.

Thanks, Amina

AminaLEM commented 2 years ago

This is an example of the config file generated by inference_pipeline.py (I changed models in inference_pipeline.py according to the local paths):

{
  "signal_alignment_args": {
    "target_regions": null,
    "track_memory_usage": false,
    "threshold": 0.1,
    "event_table": null,
    "embed": false,
    "delete_tmp": true,
    "output_format": "full"
  },
  "samples": [
    {
      "positions_file": "/prj/Amina/yeast/yeast_rrna_modification_detection/training/small_5mer/yeast_18S_25S_variants.positions",
      "fast5_dirs": [
        "/home/alemsara/inference/20190610_R941_CBF5GAL/20190610_2059_MN20528_FAK94344_20865eb5/signalalign_output/split_fast5s"
      ],
      "bwa_reference": "/prj/Amina/yeast/yeast_rrna_modification_detection/training/reference/yeast_25S_18S.fa",
      "fofns": [],
      "readdb": "/prj/Amina/yeast/yeast_rrna_modification_detection/end_to_end/fastq/20190610_R941_CBF5GAL/20190610_2059_MN20528_FAK94344_20865eb5/20190610_2059_MN20528_FAK94344_20865eb5.10000.fastq.index.readdb",
      "fw_reference": null,
      "bw_reference": null,
      "kmers_from_reference": false,
      "motifs": null,
      "name": "20190610_R941_CBF5GAL",
      "probability_threshold": 0.7,
      "number_of_kmer_assignments": 10000,
      "alignment_file": "/prj/Amina/yeast/yeast_rrna_modification_detection/end_to_end/fastq/20190610_R941_CBF5GAL/20190610_2059_MN20528_FAK94344_20865eb5/20190610_2059_MN20528_FAK94344_20865eb5.10000.2308.sorted.bam",
      "recursive": false,
      "assignments_dir": null
    }
  ],
  "path_to_bin": "/root/src/signalAlign/bin",
  "complement_hdp_model": null,
  "template_hdp_model": null,
  "complement_hmm_model": null,
  "template_hmm_model": "/prj/Amina/yeast/yeast_rrna_modification_detection/testing/yeast_rrna_ivt_wt_trained_071521.model",
  "job_count": 96,
  "debug": false,
  "two_d": false,
  "output_dir": "/home/alemsara/inference/20190610_R941_CBF5GAL/20190610_2059_MN20528_FAK94344_20865eb5/signalalign_output",
  "constraint_trim": null,
  "diagonal_expansion": null,
  "traceBackDiagonals": 150,
  "filter_reads": 0,
  "perform_kmer_event_alignment": true,
  "overwrite": true,
  "rna": true,
  "ambig_model": "/prj/Amina/yeast/yeast_rrna_modification_detection/training/small_5mer/small_variants.model",
  "built_alignments": null,
  "delete_alignments": false
}

adbailey4 commented 2 years ago

I think there are two things here.

1) Great catch on the model inconsistencies! I forgot to update inference_pipeline.py with the new model. If you go to https://github.com/adbailey4/yeast_rrna_modification_detection/blob/main/testing/testing.md, you can see what commands I used to generate data for our paper. I will edit inference_pipeline.py to update the versions.

2) I still think the reason you are not getting any output is a data path specification issue. If the model were missing or incorrectly specified, there would be an error.
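A quick way to rule out a path problem is to check every per-sample path in the config before launching signalAlign. The following is only a minimal sketch: `missing_sample_paths` is a hypothetical helper (not part of signalAlign), and the key names are simply the ones that appear in the config posted above.

```python
import os


def missing_sample_paths(config):
    """Return (sample_name, key, path) for each sample path that does not exist.

    Hypothetical helper, not part of signalAlign. The key names are the
    ones visible in the config posted in this thread.
    """
    missing = []
    for sample in config.get("samples", []):
        name = sample.get("name", "<unnamed>")
        # Keys that hold a single file path (null entries are skipped).
        for key in ("positions_file", "bwa_reference", "readdb", "alignment_file"):
            path = sample.get(key)
            if path and not os.path.exists(path):
                missing.append((name, key, path))
        # Key that holds a list of directories.
        for path in sample.get("fast5_dirs") or []:
            if not os.path.isdir(path):
                missing.append((name, "fast5_dirs", path))
    return missing


# Usage (assumes the config has already been parsed with json.load):
#     for name, key, path in missing_sample_paths(config):
#         print(f"{name}: {key} not found: {path}")
```

An empty result only means the paths exist on disk; it does not show that the files inside them are the ones the readdb refers to.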

Can you check the file paths listed in "/prj/Amina/yeast/yeast_rrna_modification_detection/end_to_end/fastq/20190610_R941_CBF5GAL/20190610_2059_MN20528_FAK94344_20865eb5/20190610_2059_MN20528_FAK94344_20865eb5.10000.fastq.index.readdb"? Make sure the complete path for each file exists. Usually, if no files are processed and there are no outputs even though everything is specified correctly, the problem is with the readdb.
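The readdb check can be scripted. This sketch assumes the readdb is a tab-separated text file with a read ID in the first column and a fast5 path in the second; that layout is an assumption here, so look at a few lines of your own file first. `missing_readdb_files` is a hypothetical helper name, not a signalAlign function.

```python
import os


def missing_readdb_files(readdb_path):
    """Return (read_id, fast5_path) for each readdb entry whose file is absent.

    Assumes one tab-separated "read_id<TAB>fast5_path" entry per line.
    Lines without a second column are skipped.
    """
    missing = []
    with open(readdb_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2 or not fields[1]:
                continue
            read_id, fast5_path = fields[0], fields[1]
            # Relative paths are resolved against the current working directory.
            if not os.path.isfile(fast5_path):
                missing.append((read_id, fast5_path))
    return missing
```

If every entry is reported missing, the readdb was probably built on another machine or from a different directory, and the paths no longer match where the fast5 files live now.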

Another option is to run in debug mode: set "debug": true and see what errors crop up. It will be quite a bit more verbose and slower, but it is usually more helpful for debugging.
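If you would rather not hand-edit the generated config, something like the following writes a debug copy of it. It is a sketch that only assumes the top-level "debug" key shown in the config above; `enable_debug` and the output filename are made up for illustration.

```python
import json


def enable_debug(config_path, out_path):
    """Write a copy of a JSON config with the top-level "debug" key set to true."""
    with open(config_path) as handle:
        config = json.load(handle)
    config["debug"] = True
    with open(out_path, "w") as handle:
        json.dump(config, handle, indent=2)
    return config


# Usage (hypothetical file names):
#     enable_debug("inference.config.json", "inference.debug.config.json")
```

You can then pass the debug copy to the same runSignalAlign.py run --config command and leave the original config untouched.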