salmon alevin expectCells, forceCells and whitelist conflicting?

crazyhottommy commented 5 years ago

Hi,

I am using Alevin to process some 10x v2 data.

salmon -v
salmon 0.12.0

when not providing whitelist the mapping rate is low 20%, and many reads are discarded because of the noisy barcodes.

not providing whitelist, forceCells 3000

[2019-01-29 15:49:33.693] [alevinLog] [info] Done barcode density calculation.
[2019-01-29 15:49:33.693] [alevinLog] [info] # Barcodes Used: 287883370 / 287983348.
[2019-01-29 15:49:35.112] [alevinLog] [info] Total 49.4292% reads will be thrown away because of noisy Cellular barcodes.
[2019-01-29 15:49:35.112] [alevinLog] [info] Total 4000(has 1000 low confidence) barcodes
[2019-01-29 15:49:35.251] [alevinLog] [info] Done True Barcode Sampling
[2019-01-29 15:49:35.896] [alevinLog] [info] Done populating Z matrix
[2019-01-29 15:49:35.994] [alevinLog] [info] Done indexing Barcodes
[2019-01-29 15:49:35.994] [alevinLog] [info] Total Unique barcodes found: 3214859
[2019-01-29 15:49:35.994] [alevinLog] [info] Used Barcodes except Whitelist: 105901
[2019-01-29 15:49:36.107] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify

[2019-01-29 15:49:36.107] [alevinLog] [info] parsing read library format
[2019-01-29 15:49:36.415] [stderrLog] [info] Loading Suffix Array
[2019-01-29 15:49:36.107] [jointLog] [info] There is 1 library.
[2019-01-29 15:49:36.373] [jointLog] [info] Loading Quasi index
[2019-01-29 15:49:36.396] [jointLog] [info] Loading 32-bit quasi index
[2019-01-29 15:49:50.479] [stderrLog] [info] Loading Transcript Info
[2019-01-29 15:49:55.550] [stderrLog] [info] Loading Rank-Select Bit Array
[2019-01-29 15:49:56.036] [stderrLog] [info] There were 58,086 set bits in the bit array
[2019-01-29 15:49:56.077] [stderrLog] [info] Computing transcript lengths
[2019-01-29 15:49:56.077] [stderrLog] [info] Waiting to finish loading hash
[2019-01-29 15:49:56.077] [stderrLog] [info] Done loading index
[2019-01-29 15:49:56.077] [jointLog] [info] done
[2019-01-29 15:49:56.077] [jointLog] [info] Index contained 58,086 targets
processed 287 Million fragments
hits: 152080339, hits per frag:  0.529087

[2019-01-29 16:05:46.677] [jointLog] [info] Computed 58,128 rich equivalence classes for further processing
[2019-01-29 16:05:46.677] [jointLog] [info] Counted 58,047,553 total reads in the equivalence classes
[2019-01-29 16:05:46.678] [jointLog] [warning] Found 21006 reads with `N` in the UMI sequence and ignored the reads.
Please report on github if this number is too large
[2019-01-29 16:05:46.678] [jointLog] [info] Mapping rate = 20.1566%

[2019-01-29 16:05:46.678] [jointLog] [info] finished quantifyLibrary()
[2019-01-29 16:05:51.316] [alevinLog] [info] Starting optimizer

**Analyzed 4000 cells (100% of all).**
[2019-01-29 16:06:27.447] [alevinLog] [info] Total 20167967.00 UMI after deduplicating.
[2019-01-29 16:06:27.475] [alevinLog] [info] Clearing EqMap; Might take some time.
[2019-01-29 16:06:39.341] [alevinLog] [info] Starting Import of the gene count matrix of size 4000x29574.
[2019-01-29 16:06:40.089] [alevinLog] [info] Done initializing the empty matrix.
[2019-01-29 16:06:45.902] [alevinLog] [info] Done Importing gene count matrix for dimension 4000x29574
0.00248159      1.3538e-06      0.000205983     221334
0.00714905      1.09739e-06     0.000617623     71036.5
[2019-01-29 16:06:46.603] [alevinLog] [info] Starting white listing
[2019-01-29 16:06:46.615] [alevinLog] [info] Done importing order of barcodes "quants_mat_rows.txt" file.
[2019-01-29 16:06:46.615] [alevinLog] [info] Total 4000 barcodes found
[2019-01-29 16:06:46.615] [alevinLog] [warning] mrna file not provided; using is 1 less feature for whitelisting
[2019-01-29 16:06:46.615] [alevinLog] [warning] rrna file not provided; using is 1 less feature for whitelisting
[2019-01-29 16:06:46.615] [alevinLog] [info] Starting to make feature Matrix
[2019-01-29 16:06:46.790] [alevinLog] [info] Done making regular featues
[2019-01-29 16:06:46.790] [alevinLog] [info] Done making feature Matrix
[2019-01-29 16:06:46.871] [alevinLog] [info] Finished white listing
[2019-01-29 16:06:46.957] [alevinLog] [info] Finished optimizer

no whitelist, expectCells 3000

[2019-01-29 15:45:58.012] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. Since not explicitly specified, it is being set to 0.65
[2019-01-29 15:45:58.012] [jointLog] [info] Usage of --validateMappings implies use of range factorization. rangeFactorizationBins is being set to 4
[2019-01-29 15:45:58.012] [jointLog] [info] Usage of --validateMappings implies a default consensus slack of 1. Setting consensusSlack to 1.
[2019-01-29 15:45:58.012] [jointLog] [info] Using default value of 0.8 for minScoreFraction in Alevin
[2019-01-29 15:45:58.021] [alevinLog] [info] Processing barcodes files (if Present)

processed 287 Million barcodes

[2019-01-29 15:51:37.144] [alevinLog] [info] Done barcode density calculation.
[2019-01-29 15:51:37.144] [alevinLog] [info] # Barcodes Used: 287883370 / 287983348.
[2019-01-29 15:51:38.549] [alevinLog] [error] Can't find right Boundary.
Please Report this issue on github.

whitelist forceCells 3000

The mapping rate was boost to 37%, but now the forceCells and expectCells seems not work, alevin still processed over 400,000 cells.

cat logs/salmon_quant.log
[2019-01-28 21:02:08.487] [jointLog] [info] Fragment incompatibility prior below threshold.  Incompatible fragments will be ignored.
[2019-01-28 21:02:08.487] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. Since not explicitly specified, it is being set to 0.65
[2019-01-28 21:02:08.487] [jointLog] [info] Usage of --validateMappings implies use of range factorization. rangeFactorizationBins is being set to 4
[2019-01-28 21:02:08.487] [jointLog] [info] Usage of --validateMappings implies a default consensus slack of 1. Setting consensusSlack to 1.
[2019-01-28 21:02:08.487] [jointLog] [info] Using default value of 0.8 for minScoreFraction in Alevin
[2019-01-28 21:09:02.560] [jointLog] [info] There is 1 library.
[2019-01-28 21:09:04.049] [jointLog] [info] Loading Quasi index
[2019-01-28 21:09:04.286] [jointLog] [info] Loading 32-bit quasi index
[2019-01-28 21:09:43.870] [jointLog] [info] done
[2019-01-28 21:09:43.870] [jointLog] [info] Index contained 58,086 targets
[2019-01-28 21:40:30.445] [jointLog] [info] Computed 64,654 rich equivalence classes for further processing
[2019-01-28 21:40:30.445] [jointLog] [info] Counted 107,408,832 total reads in the equivalence classes
[2019-01-28 21:40:30.446] [jointLog] [warning] Found 39484 reads with `N` in the UMI sequence and ignored the reads.
Please report on github if this number is too large
[2019-01-28 21:40:30.446] [jointLog] [info] Mapping rate = 37.2969%

[2019-01-28 21:40:30.446] [jointLog] [info] finished quantifyLibrary()

cat alevin/alevin.log
[2019-01-28 21:02:08.496] [alevinLog] [info] Processing barcodes files (if Present)

[2019-01-28 21:07:56.106] [alevinLog] [info] Done barcode density calculation.
[2019-01-28 21:07:56.106] [alevinLog] [info] # Barcodes Used: 287883370 / 287983348.
[2019-01-28 21:07:57.808] [alevinLog] [info] Done importing white-list Barcodes
[2019-01-28 21:07:58.273] [alevinLog] [warning] Skipping 290359 Barcodes with 0 reads
 Assuming this is the required behavior.
[2019-01-28 21:07:58.517] [alevinLog] [info] Total 446921 white-listed Barcodes
[2019-01-28 21:09:02.029] [alevinLog] [info] Done populating Z matrix
[2019-01-28 21:09:02.331] [alevinLog] [info] Done indexing Barcodes
[2019-01-28 21:09:02.331] [alevinLog] [info] Total Unique barcodes found: 3214859
[2019-01-28 21:09:02.331] [alevinLog] [info] Used Barcodes except Whitelist: 168781
[2019-01-28 21:09:02.559] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify

[2019-01-28 21:09:02.559] [alevinLog] [info] parsing read library format
[2019-01-28 21:40:44.514] [alevinLog] [info] Starting optimizer

[2019-01-28 22:05:50.958] [alevinLog] [info] Total 39006157.00 UMI after deduplicating.
[2019-01-28 22:05:50.960] [alevinLog] [warning] Skipped 160108 barcodes due to No mapped read
[2019-01-28 22:11:23.270] [alevinLog] [info] Finished optimizer

[2019-01-28 21:40:30.445] [jointLog] [info] Computed 64,654 rich equivalence classes for further processing
[2019-01-28 21:40:30.445] [jointLog] [info] Counted 107,408,832 total reads in the equivalence classes
[2019-01-28 21:40:30.446] [jointLog] [warning] Found 39484 reads with `N` in the UMI sequence and ignored the reads.
Please report on github if this number is too large
[2019-01-28 21:40:30.446] [jointLog] [info] Mapping rate = 37.2969%

[2019-01-28 21:40:30.446] [jointLog] [info] finished quantifyLibrary()
[2019-01-28 21:40:44.514] [alevinLog] [info] Starting optimizer

Analyzed 446921 cells (100% of all).
[2019-01-28 22:05:50.958] [alevinLog] [info] Total 39006157.00 UMI after deduplicating.
[2019-01-28 22:05:50.960] [alevinLog] [warning] Skipped 160108 barcodes due to No mapped read
[2019-01-28 22:11:23.270] [alevinLog] [info] Finished optimizer

crazyhottommy commented 5 years ago

not sure what's the best way to share the 34G fastqs for your test.

k3yavi commented 5 years ago

I think I see the issue, and we might not actually need the data. The issue being the list of CBs in whitelist.txt is too many. It seems the data is very noisy and knee finding algorithm is failing. I'd suggest to use --dumpFeatures flag with alevin (you can also do --noQuant to stop alevin before mapping). This flag will generate raw_cb_frequency.txt file i.e. the frequency of all the observed CB and making the histogram will help visually find if there is possible knee in the data, generally it is in a real experiment.

re: the mapping rate, in the alevin log you'll observe there were 50% of the CB which were thrown away if you use forceCells 3000 w/o whitelist, basically alevin is taking top 3000 cells into consideration and throwing everything away. While the rise in mapping rate when externally provided with whitelist is actually by allowing more CB to go through. The two options forceCells and whitelist wan't intended to be use simultaneously because if provided with external whitelist alevin assumes the user is confident with the set of CBs, in your case this list is huge (~700k). Are you using the 10x whitelist from their website? If yes, then it's not what alevin expect with --whitelist option.

One work around here would be to look at the CB frequency histogram and making a file with only the CB sequences which you thing are above knee. Providing that file as the --whitelist is the intended use case for alevin.

Hope this makes sense.

crazyhottommy commented 5 years ago

yes, I am using the 10x whitelist downloaded from their website. Thanks for clarifying it. so should I not use --whitelist at all and let alevin determine what BCs to use? 50% of reads throwing away seems to be too much. what percentage do you observe? Thanks! without whitelist and use --expectCells 3000 gives me error] Can't find right Boundary. I should do --dumpFeatures and --noQuant to get the whitelist first?

k3yavi commented 5 years ago

Hi @crazyhottommy ,

Reading a bit in https://github.com/COMBINE-lab/salmon/issues/245 and https://github.com/COMBINE-lab/salmon/issues/284 would definitely help understand more about the pipeline.

summary: it's possible the 50% mapping read is due to alevin throwing away CB. In this case, we normally suggest to try running alevin in the following order: 1.) --expectCells X: alevin will look for local knee threshold near to top X barcodes. Alevin can still here based on the CB frequency distribution. 2.) --forceCells X: run alevin with --noQuant --dumpFeature mode and extract the frequency histogram of the CBs present in the histogram. Try to manually found knee in the descending sorted CB frequency histogram and figure out X. alevin will use top X barcodes as specified by the user. 3.) If there is other tools like cellranger already been run on the data, then alevin can directly consume their predicted CB sequence using --whitelist option. This is different file, not the 727k file from 10x, and is different for each experiment .

crazyhottommy commented 5 years ago

Thanks for the links. closing now.

matthew-pumc commented 3 years ago

Hi @k3yavi ,

I'm using alevin to process 10X V3 data and encountered similar problem with this issue.

I've tried run the pipeline using default whitelisting by alevin, --whitelist barcode.txt which from cellranger v.3.1.0 run (including 7938 barcodes), --expectCells 10000, and --expectCells 30000.

But no matter how I change the parameter, the log shows that there are always about 50% percent reads has been thrown away, and the mapping rate was between 18.7%-19.1%.

the salmon version is salmon 1.4.0 the reference genome is sequenced by ourselves, and it's a plant. my reads layout is paired end 150bp,

R1: @A00582:424:HJYLGDSXY:3:1101:1090:1000 1:N:0:ACCGGCTC TAACCAGGTCGAGTGAGTATTTAAGGCGCGCGGCGCACCAACGCACTCCCAACAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA + FFFFFFFFFFFFFFFFFFFFFFFFFFFF,,:,,FF:,,,::FF,,:F,,,,,,F:,,,:,::FF::::::,FFF:F:FF:FFFFFFF::FF::FF,F:F:FF:F,FFFF,:FF,FFFFF:,FF:::FF:FFF:FF:FF:FFFFFFFFFF: R2: @A00582:424:HJYLGDSXY:3:1101:1090:1000 2:N:0:ACCGGCTC NCCTAGAAGCAGCCACCCTTGAAAGAGTGCGTAATAGCTCACTGATCGAGCGCTCTTGCGCCGAAGATGAACGGGGCTAAGCGATCTGCCGAAGCTGTGGGATGTAAAAATACATCGGTAGGGGAGCGTTCCGCCTTAGAGAGAAGCCTC +

FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFF::FFFFFFFFFF:FFFFFFFFFFFFF:FFFFFF:FFFFFFF:

Here is the logs

Default setting

salmon alevin -l ISR -1 ../clean/sample_S1_L001_R1_001.fastq -2 ../clean/sample_S1_L001_R2_001.fastq --chromiumV3 -i ../../dna/00.ref_genome/sample/salmon_index_allmRNA -p 40 -o test_alevin_allmRNA --tgMap ../../dna/00.ref_genome/sample/alltxp2gene.tsv

[2021-01-25 16:22:09.565] [alevinLog] [info] Found 43030 transcripts(+0 decoys, +0 short and +0 duplicate names in the index) [2021-01-25 16:22:09.615] [alevinLog] [info] Filled with 43030 txp to gene entries [2021-01-25 16:22:09.620] [alevinLog] [info] Found all transcripts to gene mappings [2021-01-25 16:22:09.631] [alevinLog] [info] Processing barcodes files (if Present) [2021-01-25 16:26:35.067] [alevinLog] [info] Done barcode density calculation. [2021-01-25 16:26:35.067] [alevinLog] [info] # Barcodes Used: 188934609 / 188934609. [2021-01-25 16:26:42.979] [alevinLog] [info] Knee found left boundary at 21 [2021-01-25 16:27:05.707] [alevinLog] [warning] Gauss Prediction 4969 Too far from knee prediction skipping it [2021-01-25 16:27:05.707] [alevinLog] [info] Learned InvCov: 556.394 normfactor: 9159.58 [2021-01-25 16:27:05.707] [alevinLog] [info] Total 222(has 201 low confidence) barcodes [2021-01-25 16:27:06.573] [alevinLog] [info] Done True Barcode Sampling [2021-01-25 16:27:07.383] [alevinLog] [warning] Total 96.7029% reads will be thrown away because of noisy Cellular barcodes. [2021-01-25 16:27:07.412] [alevinLog] [info] Done populating Z matrix [2021-01-25 16:27:07.414] [alevinLog] [info] Total 3667 CB got sequence corrected [2021-01-25 16:27:07.414] [alevinLog] [info] Done indexing Barcodes [2021-01-25 16:27:07.414] [alevinLog] [info] Total Unique barcodes found: 3896665 [2021-01-25 16:27:07.414] [alevinLog] [info] Used Barcodes except Whitelist: 3667 [2021-01-25 16:27:07.498] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify [2021-01-25 16:27:07.498] [alevinLog] [info] parsing read library format [2021-01-25 16:30:54.542] [alevinLog] [info] Starting optimizer [2021-01-25 16:30:54.782] [alevinLog] [warning] mrna file not provided; using is 1 less feature for whitelisting [2021-01-25 16:30:54.782] [alevinLog] [warning] rrna file not provided; using is 1 less feature for whitelisting [2021-01-25 16:30:55.950] [alevinLog] [info] Total 1350278.00 UMI after deduplicating. [2021-01-25 16:30:55.950] [alevinLog] [info] Total 30909 BiDirected Edges. [2021-01-25 16:30:55.950] [alevinLog] [info] Total 8817 UniDirected Edges. [2021-01-25 16:30:55.969] [alevinLog] [info] Clearing EqMap; Might take some time. [2021-01-25 16:30:56.294] [alevinLog] [warning] Num High confidence barcodes too less 20 < 90.Can't performing whitelisting; Skipping [2021-01-25 16:30:56.297] [alevinLog] [info] Finished optimizer

with `--exceptCells 7000`

[2021-01-21 09:24:45.891] [alevinLog] [info] Found 43030 transcripts(+0 decoys, +0 short and +0 duplicate names in the index) [2021-01-21 09:24:45.942] [alevinLog] [info] Filled with 43030 txp to gene entries [2021-01-21 09:24:45.947] [alevinLog] [info] Found all transcripts to gene mappings [2021-01-21 09:24:45.967] [alevinLog] [info] Processing barcodes files (if Present) [2021-01-21 09:33:35.885] [alevinLog] [info] Done barcode density calculation. [2021-01-21 09:33:35.885] [alevinLog] [info] # Barcodes Used: 188934609 / 188934609. [2021-01-21 09:33:37.337] [alevinLog] [info] Total 10016(has 1000 low confidence) barcodes [2021-01-21 09:33:38.202] [alevinLog] [info] Done True Barcode Sampling [2021-01-21 09:33:39.137] [alevinLog] [warning] Total 52.0343% reads will be thrown away because of noisy Cellular barcodes. [2021-01-21 09:33:39.960] [alevinLog] [info] Done populating Z matrix [2021-01-21 09:33:39.989] [alevinLog] [info] Total 34923 CB got sequence corrected [2021-01-21 09:33:39.994] [alevinLog] [info] Done indexing Barcodes [2021-01-21 09:33:39.994] [alevinLog] [info] Total Unique barcodes found: 3896665 [2021-01-21 09:33:39.994] [alevinLog] [info] Used Barcodes except Whitelist: 34503 [2021-01-21 09:33:40.718] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify [2021-01-21 09:33:40.718] [alevinLog] [info] parsing read library format [2021-01-21 09:48:11.430] [alevinLog] [info] Starting optimizer [2021-01-21 09:48:12.160] [alevinLog] [warning] mrna file not provided; using is 1 less feature for whitelisting [2021-01-21 09:48:12.160] [alevinLog] [warning] rrna file not provided; using is 1 less feature for whitelisting [2021-01-21 09:48:36.288] [alevinLog] [info] Total 19031525.00 UMI after deduplicating. [2021-01-21 09:48:36.288] [alevinLog] [info] Total 454402 BiDirected Edges. [2021-01-21 09:48:36.288] [alevinLog] [info] Total 113688 UniDirected Edges. [2021-01-21 09:48:36.288] [alevinLog] [warning] Skipped 44 barcodes due to No mapped read [2021-01-21 09:48:36.307] [alevinLog] [info] Clearing EqMap; Might take some time. [2021-01-21 09:48:41.314] [alevinLog] [info] Starting white listing of 9971 cells [2021-01-21 09:48:41.314] [alevinLog] [info] Starting to make feature Matrix [2021-01-21 09:48:41.337] [alevinLog] [info] Done making feature Matrix [2021-01-21 09:48:41.557] [alevinLog] [info] Finished white listing [2021-01-21 09:48:41.580] [alevinLog] [info] Finished optimizer

{ "total_reads": 188934609, "reads_with_N": 0, "noisy_cb_reads": 98310747, "noisy_umi_reads": 16600, "used_reads": 90607262, "mapping_rate": 18.89108045842464, "reads_in_eqclasses": 35691789, "total_cbs": 3896665, "used_cbs": 44518, "initial_whitelist": 9015, "low_conf_cbs": 1000, "num_features": 5, "no_read_mapping_cbs": 44, "final_num_cbs": 6765, "deduplicated_umis": 19031525, "mean_umis_per_cell": 2813, "mean_genes_per_cell": 1315 }

My best result with `--exceptCells 30000`

... [2021-01-23 11:07:52.910] [alevinLog] [info] Done barcode density calculation. [2021-01-23 11:07:52.910] [alevinLog] [info] # Barcodes Used: 188934609 / 188934609. [2021-01-23 11:07:54.387] [alevinLog] [info] Total 12507(has 995 low confidence) barcodes [2021-01-23 11:07:55.251] [alevinLog] [info] Done True Barcode Sampling [2021-01-23 11:07:56.200] [alevinLog] [info] Total 49.0191% reads will be thrown away because of noisy Cellular barcodes. [2021-01-23 11:07:57.144] [alevinLog] [info] Done populating Z matrix [2021-01-23 11:07:57.172] [alevinLog] [info] Total 35787 CB got sequence corrected [2021-01-23 11:07:57.177] [alevinLog] [info] Done indexing Barcodes [2021-01-23 11:07:57.177] [alevinLog] [info] Total Unique barcodes found: 3896665 [2021-01-23 11:07:57.177] [alevinLog] [info] Used Barcodes except Whitelist: 35219 [2021-01-23 11:07:57.360] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify ....

{ "total_reads": 188934609, "reads_with_N": 0, "noisy_cb_reads": 92614076, "noisy_umi_reads": 17028, "used_reads": 96303505, "mapping_rate": 19.451325087824434, "reads_in_eqclasses": 36750285, "total_cbs": 3896665, "used_cbs": 47725, "initial_whitelist": 11511, "low_conf_cbs": 995, "num_features": 5, "no_read_mapping_cbs": 70, "final_num_cbs": 8324, "deduplicated_umis": 19613485, "mean_umis_per_cell": 2356, "mean_genes_per_cell": 1120 }

I'm wondering that

how can I use as much reads as possible and improve the mapping rate.
will the 150bp reads R1 affect the pipeline, and if it will, how can I make it to 28bp.

By the way, the cellranger result shows that reads map to Transcriptome is low, but reads mapped to Genome is 85%

Reads Mapped to Genome	85.2%
Reads Mapped Confidently to Genome	45.8%
Reads Mapped Confidently to Intergenic Regions	11.0%
Reads Mapped Confidently to Intronic Regions	4.2%
Reads Mapped Confidently to Exonic Regions	30.6%
Reads Mapped Confidently to Transcriptome	25.3%
Reads Mapped Antisense to Gene	0.9%

Estimated Number of Cells	7,938
Fraction Reads in Cells	73.1%
Mean Reads per Cell	23,801
Median Genes per Cell	1,076
Total Genes Detected	17,492
Median UMI Counts per Cell	2,155

Best wishes, Matthew

COMBINE-lab / salmon