With your updated documentation, it is much easier to repeat the example analysis using knock-knock. Below I summarize the bugs, issues, and concerns that I found while exploring knock-knock with the example data (Pacbio) and the latest documentation (commit: 0c347a33e24754e1d564348cf82b8ff1308bd92a). I hope it saves you some time when you update knock-knock.
Solution: the documentation does not mention that, for this dataset, we first need to run the following line to build indices for the supplementary genome e_coli:
knock-knock build-indices PROJECT_DIR e_coli
The process --stage categorize command:
Command used:
knock-knock process PROJECT_DIR pacbio R_PCR --stage categorize
Error: 'PacbioExperiment' object has no attribute 'uncommon_read_type'
Solution: add self.uncommon_read_type = 'CCS' into "pacbio_experiment.py"
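To illustrate the kind of change I mean, here is a minimal sketch. It uses a hypothetical stand-in class (PacbioExperimentSketch) rather than the real PacbioExperiment, whose constructor and base class are not reproduced here, and the helper read_type_or_none is also hypothetical. It only shows that defining the attribute in the constructor makes the lookup succeed.

```python
class PacbioExperimentSketch:
    """Hypothetical stand-in for knock-knock's PacbioExperiment (assumption)."""

    def __init__(self):
        # Workaround from this report: define the attribute that the
        # categorize stage expects to find on the experiment object.
        self.uncommon_read_type = 'CCS'


def read_type_or_none(experiment):
    # Hypothetical helper: getattr with a default returns None instead of
    # raising AttributeError when the attribute was never set.
    return getattr(experiment, 'uncommon_read_type', None)


patched = PacbioExperimentSketch()
unpatched = object()  # stands in for an experiment without the attribute

print(read_type_or_none(patched))
print(read_type_or_none(unpatched))
```

Whether 'CCS' is the right value for this attribute is my assumption and should be checked against the intended design.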
The process --stage generate_figure command:
Command used:
knock-knock process PROJECT_DIR pacbio R_PCR --stage generate_figure
Error1: AttributeError: 'PacbioExperiment' object has no attribute
Solution: add self.preprocessed_read_type = 'CCS' to "pacbio_experiment.py"
Error2: AttributeError: 'TargetInfo' object has no attribute 'all_supplemental_reference_names'. Did you mean: 'supplemental_reference_sequences'?
Solution: add self.all_supplemental_reference_names = [] into "target_info.py"
Partial solution: add a filtering step beforehand (I am not sure whether this is a legitimate fix). The code is: alignments = copy.deepcopy([al for al in alignments if al.reference_name in reference_order])
Error4: AttributeError: 'PacbioExperiment' object has no attribute 'batch':
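For reference, the filtering step from the partial solution in this section can be sketched in isolation as follows. This is only an illustration under assumptions: Alignment is a hypothetical stand-in for the real alignment objects, the function name keep_known_references is mine, and the reference names 'target' and 'donor' are made up. The only behaviour it relies on is that each alignment has a reference_name and that reference_order supports membership tests.

```python
import copy
from collections import namedtuple

# Hypothetical stand-in for an alignment record; only reference_name matters here.
Alignment = namedtuple('Alignment', ['query_name', 'reference_name'])


def keep_known_references(alignments, reference_order):
    """Return deep copies of the alignments whose reference is in reference_order."""
    known = set(reference_order)  # works for a list of names or a dict keyed by name
    return copy.deepcopy([al for al in alignments if al.reference_name in known])


# Made-up example: one alignment is to a supplemental genome (e_coli)
# that does not appear in reference_order.
alignments = [
    Alignment('read_1', 'target'),
    Alignment('read_1', 'e_coli'),
    Alignment('read_1', 'donor'),
]
reference_order = ['target', 'donor']

filtered = keep_known_references(alignments, reference_order)
print([al.reference_name for al in filtered])  # ['target', 'donor']
```

Note that this silently drops alignments to references outside reference_order, so it hides them from the figure rather than drawing them, which is why I am unsure it is the right fix.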
Issue1: the command in the documentation should be knock-knock table PROJECT_DIR, not knock-knock table BASE_DIR
Concern1: although I am testing with the Pacbio dataset only, this command assumes that both Illumina and Pacbio results are in place. I need to remove the data/illumina folder for this command to not complain.
Error1: missing "exp.batch"
Solution: change "exp.batch" to "exp.batch_name" in the "experiment.py" script
Error2: AttributeError: 'PacbioExperiment' object has no attribute 'experiment_group':
Hi Jeff,
Installation:
setup.py requires "hits v0.4.1", which is not available from PyPI: https://github.com/jeffhussmann/knock-knock/blob/0c347a33e24754e1d564348cf82b8ff1308bd92a/setup.py#L53
Install example dataset:
Build targets:
The parallel command:
knock-knock process command first. Encountered the same issue, so I decided to test each --stage separately.
The process --stage preprocess command: works well
The process --stage align command:
The process --stage categorize command:
self.uncommon_read_type = 'CCS' into "pacbio_experiment.py"
The process --stage generate_figure command:
self.preprocessed_read_type = 'CCS' to "pacbio_experiment.py"
self.all_supplemental_reference_names = [] into "target_info.py"
reference_name that is not in the reference_order object: https://github.com/jeffhussmann/knock-knock/blob/0c347a33e24754e1d564348cf82b8ff1308bd92a/knock_knock/visualize/architecture.py#L601
alignments = copy.deepcopy([al for al in alignments if al.reference_name in reference_order])
The knock-knock table command:
knock-knock table PROJECT_DIR not knock-knock table BASE_DIR
data/illumina folder for this command to not complain.
After the above steps, I can successfully generate results using the example dataset, which are saved here: https://www.dropbox.com/sh/21n95nh0quvom4i/AACjjtxzC3iXeXoFP-lIXPEra?dl=0. Can you take a look and let me know whether the results look correct to you?
Thank you for looking into this. If you are willing to review, I am happy to create a pull request with all the changes. Let me know,
--Kai