ebrintn / Scholarly-Project


Requirements #1

Open ebrintn opened 10 months ago

ebrintn commented 10 months ago

SARS-CoV-2 dataset:

| continent | count |
|---|---|
| Asia | 60 |
| Europe | 1068 |
| North America | 387 |
| Oceania | 69 |
| South America | 19 |

ebrintn commented 10 months ago

HIV-1 dataset: Los Alamos Compendium HIV-1 sequences from 2021, all subtypes (198 sequences).

These sequences are already aligned by the Los Alamos research centre.

ebrintn commented 8 months ago

Getting Watterson's theta, Lamarck, and pi (nucleotide diversity)
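
For reference, a minimal sketch of how Watterson's theta and pi could be computed from an aligned set of sequences. This is not necessarily how the project computes them: the function names and the toy alignment are made up for illustration, and it assumes all sequences have equal length and treats gap characters like any other symbol. Both values are per locus (divide by the alignment length for per-site values).

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct character."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def wattersons_theta(seqs):
    """Watterson's estimator: S / a_n, where a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def nucleotide_diversity(seqs):
    """Pi: mean number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs)


# Hypothetical toy alignment (not project data): 4 sequences, 4 sites.
toy = ["ACGT", "ACGA", "ATGA", "ACGT"]
print(segregating_sites(toy))     # 2 variable columns
print(wattersons_theta(toy))      # 2 / (1 + 1/2 + 1/3) = 12/11
print(nucleotide_diversity(toy))  # 7 differences over 6 pairs = 7/6
```

For real alignments with gaps or ambiguous bases, those columns would need to be filtered or handled explicitly before counting.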

ebrintn commented 3 months ago

Whole human genome sequence datasets are hard to obtain, so I switched to BRCA1 gene sequences from Homo sapiens.

ebrintn commented 3 months ago

For the BRCA1 dataset: downloaded the reference sequence (https://www.ncbi.nlm.nih.gov/nuccore/262359905), then grabbed 100 similar sequences using blastn (not megablast, because the results were too similar).