artic-network / fieldbioinformatics

The ARTIC field bioinformatics pipeline
MIT License

adding an amplicon depth plot #33

Closed will-rowe closed 4 years ago

will-rowe commented 4 years ago

This PR adds amplicon depth plots to the output of the minion pipeline. Changes to make this happen:

nickloman commented 4 years ago

Tested, looks great, thanks!

nickloman commented 4 years ago

Couple of minor bits of feedback:

Psy-Fer commented 4 years ago

Hello,

We are getting this error when running with custom amplicons:

Running: artic_plot_amplicon_depth --primerScheme /home/prom/SARS-CoV-2_GTG/protocols/Kirby/schemes/nCoV-2019/V1/nCoV-2019.scheme.bed --sampleID ADE-1 --outFilePrefix ADE-1 ADE-1*.depths
Traceback (most recent call last):
  File "/home/prom/miniconda3/envs/artic-ncov2019/bin/artic_plot_amplicon_depth", line 8, in <module>
    sys.exit(main())
  File "/home/prom/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/plot_amplicon_depth.py", line 143, in main
    go(args)
  File "/home/prom/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/plot_amplicon_depth.py", line 84, in go
    x=df['position'], bins=starts, labels=amplicons)
  File "/home/prom/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/pandas/core/reshape/tile.py", line 228, in cut
    raise ValueError('bins must increase monotonically.')
ValueError: bins must increase monotonically.
Command failed:artic_plot_amplicon_depth --primerScheme /home/prom/SARS-CoV-2_GTG/protocols/Kirby/schemes/nCoV-2019/V1/nCoV-2019.scheme.bed --sampleID ADE-1 --outFilePrefix ADE-1 ADE-1*.depths

Any thoughts?

will-rowe commented 4 years ago

Hey @Psy-Fer

I only really wrote the plotting script with the artic scheme in mind - it wasn't tested against anything else. Could you send me your primer scheme and I can see if I can work it out?

nickloman commented 4 years ago

It's very important for downstream analysis that the primer scheme BED file matches your specific amplicon scheme, rather than using our V1 scheme as seems to be the case here. The BED file specifies the start and end position and pool number of each amplicon for purposes of correct trimming.

Psy-Fer commented 4 years ago

Here is the scheme we are using

https://github.com/Psy-Fer/SARS-CoV-2_GTG/tree/master/protocols/Kirby/schemes/nCoV-2019/V1

For now, I've just commented out the line in the minion.py file.

nickloman commented 4 years ago

That looks possibly wrong to me: nCoV-2019_1 / nCoV-2019_2 relate to the pool numbers. Judging by your coordinates, amplicons 1-7 are pool 1 and 8-14 are pool 2. It might need to be sorted by start position too.

Psy-Fer commented 4 years ago

Are you looking at this file? https://github.com/Psy-Fer/SARS-CoV-2_GTG/blob/master/protocols/Kirby/schemes/nCoV-2019/V1/nCoV-2019.scheme.bed

Psy-Fer commented 4 years ago

MN908947.3  31  54  nCoV-2019_1_LEFT    nCoV-2019_1
MN908947.3  2569    2592    nCoV-2019_1_RIGHT   nCoV-2019_1
MN908947.3  4295    4321    nCoV-2019_2_LEFT    nCoV-2019_2
MN908947.3  6847    6873    nCoV-2019_2_RIGHT   nCoV-2019_2
MN908947.3  8596    8619    nCoV-2019_3_LEFT    nCoV-2019_1
MN908947.3  11049   11074   nCoV-2019_3_RIGHT   nCoV-2019_1
MN908947.3  12711   12732   nCoV-2019_4_LEFT    nCoV-2019_2
MN908947.3  15225   15246   nCoV-2019_4_RIGHT   nCoV-2019_2
MN908947.3  16847   16871   nCoV-2019_5_LEFT    nCoV-2019_1
MN908947.3  19254   19278   nCoV-2019_5_RIGHT   nCoV-2019_1
MN908947.3  21358   21386   nCoV-2019_6_LEFT    nCoV-2019_2
MN908947.3  23823   23847   nCoV-2019_6_RIGHT   nCoV-2019_2
MN908947.3  25602   25623   nCoV-2019_7_LEFT    nCoV-2019_1
MN908947.3  28146   28172   nCoV-2019_7_RIGHT   nCoV-2019_1
MN908947.3  1876    1897    nCoV-2019_8_LEFT    nCoV-2019_2
MN908947.3  4429    4450    nCoV-2019_8_RIGHT   nCoV-2019_2
MN908947.3  6287    6310    nCoV-2019_9_LEFT    nCoV-2019_1
MN908947.3  8828    8851    nCoV-2019_9_RIGHT   nCoV-2019_1
MN908947.3  10363   10384   nCoV-2019_10_LEFT   nCoV-2019_2
MN908947.3  12780   12802   nCoV-2019_10_RIGHT  nCoV-2019_2
MN908947.3  14546   14570   nCoV-2019_11_LEFT   nCoV-2019_1
MN908947.3  17131   17152   nCoV-2019_11_RIGHT  nCoV-2019_1
MN908947.3  18897   18918   nCoV-2019_12_LEFT   nCoV-2019_2
MN908947.3  21428   21455   nCoV-2019_12_RIGHT  nCoV-2019_2
MN908947.3  23123   23144   nCoV-2019_13_LEFT   nCoV-2019_1
MN908947.3  25647   25673   nCoV-2019_13_RIGHT  nCoV-2019_1
MN908947.3  27447   27471   nCoV-2019_14_LEFT   nCoV-2019_2
MN908947.3  29837   29866   nCoV-2019_14_RIGHT  nCoV-2019_2

will-rowe commented 4 years ago

Yes, I've reproduced this now. I think the cause is that I assumed the primer scheme would be sorted by amplicon start position.
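
The ordering problem can be sketched with the coordinates posted above. This is not the code in plot_amplicon_depth.py: it is a standard-library stand-in that uses `bisect` instead of `pandas.cut`, and its interval semantics are simpler than those of `pandas.cut`. The helper names (`is_strictly_increasing`, `sort_by_start`, `assign_amplicon`) are hypothetical. It shows that the start positions, taken in file order, are not increasing, and that sorting them (while keeping each start paired with its amplicon) gives usable bin edges.

```python
from bisect import bisect_right

# Start coordinate of each amplicon's LEFT primer, in the order the rows
# appear in the scheme posted in this thread (amplicon number -> start).
STARTS_IN_FILE_ORDER = {
    1: 31, 2: 4295, 3: 8596, 4: 12711, 5: 16847, 6: 21358, 7: 25602,
    8: 1876, 9: 6287, 10: 10363, 11: 14546, 12: 18897, 13: 23123, 14: 27447,
}


def is_strictly_increasing(values):
    """Return True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def sort_by_start(starts):
    """Return (edges, labels): starts in ascending order and the amplicon for each."""
    ordered = sorted(starts.items(), key=lambda item: item[1])
    edges = [start for _, start in ordered]
    labels = [number for number, _ in ordered]
    return edges, labels


def assign_amplicon(position, edges, labels):
    """Label a position with the amplicon whose start is the last edge at or before it."""
    index = bisect_right(edges, position) - 1
    return labels[index] if index >= 0 else None


file_order = list(STARTS_IN_FILE_ORDER.values())
print(is_strictly_increasing(file_order))  # False: 25602 is followed by 1876

edges, labels = sort_by_start(STARTS_IN_FILE_ORDER)
print(is_strictly_increasing(edges))  # True
print(assign_amplicon(2000, edges, labels))  # 8
```

Sorting the edges and labels together matters: sorting only the starts would leave the labels in file order and silently mislabel the bins.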

nickloman commented 4 years ago

Yes, I was looking at that file. Based on these coordinates, amplicons 1-7 would be amplified in the first pool and 8-14 in the second pool, but you have specified alternating pools for 1-14.
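
A minimal sketch of how this kind of pool mismatch could be checked, assuming that overlapping neighbouring amplicons must sit in different pools and that only neighbours (by start position) overlap. The `Amplicon` tuple and the functions `same_pool_overlaps` and `alternate_pools` are hypothetical helpers, not part of the artic package; the coordinates and pools are copied from the scheme posted above.

```python
from collections import namedtuple

Amplicon = namedtuple("Amplicon", ["number", "start", "end", "pool"])

# (amplicon number, LEFT primer start, RIGHT primer end, pool) from the scheme above.
SCHEME = [
    Amplicon(1, 31, 2592, 1), Amplicon(2, 4295, 6873, 2),
    Amplicon(3, 8596, 11074, 1), Amplicon(4, 12711, 15246, 2),
    Amplicon(5, 16847, 19278, 1), Amplicon(6, 21358, 23847, 2),
    Amplicon(7, 25602, 28172, 1), Amplicon(8, 1876, 4450, 2),
    Amplicon(9, 6287, 8851, 1), Amplicon(10, 10363, 12802, 2),
    Amplicon(11, 14546, 17152, 1), Amplicon(12, 18897, 21455, 2),
    Amplicon(13, 23123, 25673, 1), Amplicon(14, 27447, 29866, 2),
]


def same_pool_overlaps(amplicons):
    """Return number pairs for neighbouring amplicons that overlap within one pool."""
    ordered = sorted(amplicons, key=lambda a: a.start)
    conflicts = []
    for left, right in zip(ordered, ordered[1:]):
        if right.start < left.end and left.pool == right.pool:
            conflicts.append((left.number, right.number))
    return conflicts


def alternate_pools(amplicons):
    """Reassign pools 1, 2, 1, 2, ... in order of start position."""
    ordered = sorted(amplicons, key=lambda a: a.start)
    return [a._replace(pool=i % 2 + 1) for i, a in enumerate(ordered)]


print(same_pool_overlaps(SCHEME))
# [(8, 2), (9, 3), (10, 4), (11, 5), (12, 6), (13, 7)]

fixed = alternate_pools(SCHEME)
print(same_pool_overlaps(fixed))  # []
print(sorted(a.number for a in fixed if a.pool == 1))  # [1, 2, 3, 4, 5, 6, 7]
```

With the pools alternated by genomic position rather than by amplicon number, amplicons 1-7 land in pool 1 and 8-14 in pool 2, which matches the assignment described in this comment.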

Psy-Fer commented 4 years ago

OOOOOOH I see your meaning.

Long day.

I shall fix this...