Final Project: Assessing the Impact of Harmane Treatment on Horse Dung Microbiomes

Project Overview

This project explores the effects of harmane, a compound found in the mycelium of psilocybin-producing fungi, on the microbial communities in horse dung. Using 16S rRNA metabarcoding, the study compares the microbiomes of harmane-treated and untreated samples across three time points to assess the compound's impact on microbial diversity and composition.

Data

Dataset Description

The dataset comprises demultiplexed .fastq files from an Oxford Nanopore MinION 16S barcoding sequencing run using the 16S Barcoding Kit 1-24 (SQK-16S024), totaling 24 samples across four categories:

Harmane-Treated Horse Dung:
- 9 samples across three collection days.
- Days 0, 4, 14: Barcodes 1-3, 4-6, 7-9 respectively (biological replicates).
Non-Harmane-Treated Horse Dung:
- 9 control samples collected parallel to the treated group.
- Days 0, 4, 14: Barcodes 10-12, 13-15, 16-18 respectively (biological replicates).
Negative Control:
- 3 samples (barcodes 19-21, technical replicates) to check for contamination and sequencing biases.
Positive Control:
- 3 samples (barcodes 22-24, technical replicates) of a known artificial microbiome to verify sequencing and analytical accuracy.

Each sample's reads are stored in a directory named after its barcode (barcode01, barcode02, and so on) within the Demultiplexed directory.
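The barcode layout described above can be expressed as a small lookup, which may help when building a sample metadata table. This is a minimal sketch: the function name describe_barcode and the group labels (harmane_treated, untreated, negative_control, positive_control) are illustrative assumptions, not names used by the project's scripts.

```python
def describe_barcode(barcode: int) -> dict:
    """Return the sample description for a barcode number (1-24)."""
    if not 1 <= barcode <= 24:
        raise ValueError(f"barcode must be between 1 and 24, got {barcode}")

    days = (0, 4, 14)                  # collection days listed in the README
    replicate = (barcode - 1) % 3 + 1  # replicates come in groups of three
    block = (barcode - 1) // 3         # 0-7: which group of three barcodes

    if block < 3:
        group, day = "harmane_treated", days[block]      # barcodes 1-9
    elif block < 6:
        group, day = "untreated", days[block - 3]        # barcodes 10-18
    elif block == 6:
        group, day = "negative_control", None            # barcodes 19-21
    else:
        group, day = "positive_control", None            # barcodes 22-24

    return {
        "directory": f"barcode{barcode:02d}",  # e.g. barcode01
        "group": group,                        # hypothetical label
        "day": day,                            # None for controls
        "replicate": replicate,
    }
```

For example, describe_barcode(13) describes the first untreated replicate from day 4, stored in barcode13.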

Pre-processing

The raw sequencing data underwent several pre-processing steps:

Basecalling: Sequencing data was initially basecalled using Guppy, the software provided by Oxford Nanopore Technologies.
Demultiplexing: The basecalled reads were then demultiplexed using Guppy barcoder to assign reads to their respective barcode based on the 16S Barcoding Kit used during sample preparation.
File Organization: After demultiplexing, the reads were organized into one folder per barcode. Each folder contains multiple .fastq files, each holding a batch of reads from the sequencing run.
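Because each barcode folder holds several .fastq files, a per-barcode read count has to sum over all of them. The sketch below assumes uncompressed files with the standard four lines per read; the function name count_reads is hypothetical and is not part of the project's pipeline.

```python
from pathlib import Path


def count_reads(barcode_dir) -> int:
    """Count reads across every .fastq file in one barcode folder.

    Assumes uncompressed FASTQ with exactly four lines per record.
    """
    total_lines = 0
    for fastq in sorted(Path(barcode_dir).glob("*.fastq")):
        with open(fastq, "r") as handle:
            total_lines += sum(1 for _ in handle)
    return total_lines // 4
```

A folder with very few reads in a treated or untreated sample, or many reads in a negative control, would be worth checking before running the full analysis.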

The raw data is too large to include in this repository. It is stored at /fs/ess/PAS2700/users/meyer1556/Final_Project/raw_data/Demultiplexed on the Ohio Supercomputer Center (OSC) storage systems.

Project Execution and Troubleshooting

Current Status

The project is currently incomplete because of unresolved technical issues, primarily involving dependencies and configuration within the QIIME2 environment. In particular, the importDb process fails: QIIME2 cannot import the necessary data because of caching issues in the numba and umap Python modules.
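One commonly used workaround for numba caching errors is to point numba at a cache directory the user can write to, through the NUMBA_CACHE_DIR environment variable. The sketch below shows that idea only; it has not been confirmed to resolve the error in this project, and the function name use_writable_numba_cache is hypothetical.

```python
import os
from pathlib import Path


def use_writable_numba_cache(cache_dir) -> str:
    """Create cache_dir if needed and point NUMBA_CACHE_DIR at it.

    Must be called before numba (or umap, which imports numba) is imported.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"cache directory is not writable: {path}")
    # Affects this process and any child processes it starts.
    os.environ["NUMBA_CACHE_DIR"] = str(path)
    return str(path)
```

When QIIME2 is launched from a shell or batch script rather than from Python, the same variable would need to be exported in that script instead.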

Steps for Running the Project

To attempt running the project:

Data Preparation: Copy the raw data from the OSC path given above into your working directory.
Environment Setup: Configure the necessary computational environments and dependencies as detailed in the submission notes and the main script.
Pipeline Execution: Run the provided pipeline script, keeping in mind that due to unresolved issues, successful completion is not guaranteed.
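Before running the pipeline, it can be useful to confirm that the data preparation step copied all 24 barcode folders. The check below is a sketch under the directory layout described in this README (barcode01 through barcode24, each containing .fastq files); the function name missing_barcode_dirs is an assumption for illustration.

```python
from pathlib import Path


def missing_barcode_dirs(demultiplexed_dir, n_barcodes: int = 24) -> list:
    """List expected barcode folders that are absent or hold no .fastq files."""
    root = Path(demultiplexed_dir)
    expected = [f"barcode{i:02d}" for i in range(1, n_barcodes + 1)]
    missing = []
    for name in expected:
        folder = root / name
        has_reads = folder.is_dir() and any(folder.glob("*.fastq"))
        if not has_reads:
            missing.append(name)
    return missing
```

An empty list means every expected folder is present and contains at least one .fastq file.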

Additional Notes

Archive and Temporary Files: Please disregard files in any archive subdirectories and any temporary files generated during troubleshooting unless required for detailed error analysis.
Dependency and Configuration Issues: Troubleshooting required numerous adjustments to environment variables and directory permissions. These adjustments are detailed in the submission notes and reflect the project's ongoing troubleshooting efforts.

Final Remarks

Despite the challenges faced, this project offers a valuable learning opportunity in the complexities of bioinformatics workflows. Future efforts to resolve the current issues will be crucial for its completion. Your understanding and patience are appreciated.

Thank you for reviewing this project.

meyer-1556 / PLNTPTH6193-Final_Project
