microsoft / ProphetNet

A research project for natural language generation, containing the official implementations by the MSRA NLC team.
MIT License
649 stars, 104 forks

Truncated source text during inference #8

Open prithvijaunjale opened 4 years ago

prithvijaunjale commented 4 years ago

I fine-tuned the pretrained ProphetNet model for 1 epoch on my own dataset on Google Colab for a summarization task. For inference, I used:

!fairseq-interactive processed \
--path $CHECK_POINT \
--user-dir prophetnet \
--max-source-positions 6000 --max-target-positions 512 \
--task translation_prophetnet

Output:

Namespace(beam=5, bpe=None, buffer_size=1, cpu=False, criterion='cross_entropy', data='processed', dataset_impl=None, decoding_format=None, diverse_beam_groups=-1, diverse_beam_strength=0.5, empty_cache_freq=0, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', input='-', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, lazy_load=False, left_pad_source='True', left_pad_target='False', lenpen=1, load_alignments=False, log_format=None, log_interval=1000, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=1, max_source_positions=6000, max_target_positions=512, max_tokens=None, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', momentum=0.99, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=False, no_repeat_ngram_size=0, num_shards=1, num_workers=1, optimizer='nag', path='drive/My Drive/deep_learning/nlp/covid19/prophetNet/finetune_checkpoints/checkpoint1.pt', prefix_size=0, print_alignment=False, print_step=False, quiet=False, raw_text=False, remove_bpe=None, replace_unk=None, required_batch_size_multiple=8, results_path=None, retain_iter_history=False, sacrebleu=False, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang=None, target_lang=None, task='translation_prophetnet', temperature=1.0, tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, truncate_source=False, unkpen=0, unnormalized=False, upsample_primary=1, user_dir='prophetnet', warmup_updates=0, weight_decay=0.0)
| [src] dictionary: 30522 types
| [tgt] dictionary: 30522 types
| loading model(s) from drive/My Drive/deep_learning/nlp/covid19/prophetNet/finetune_checkpoints/checkpoint1.pt
tcmalloc: large alloc 1587781632 bytes == 0x34748000 @  0x7fe0e3b01b6b 0x7fe0e3b21379 0x7fe098a7f04e 0x7fe098a80f4a 0x7fe0d196e0c4 0x7fe0dfbbb5d9 0x551b15 0x5aa6ec 0x50abb3 0x50c5b9 0x508245 0x509642 0x5a55a1 0x5a58f8 0x4e07ee 0x50abe1 0x50c5b9 0x508245 0x58958c 0x5a067e 0x50d966 0x508245 0x50a080 0x50aa7d 0x50d390 0x508245 0x50a080 0x50aa7d 0x50c5b9 0x508245 0x50a080
tcmalloc: large alloc 1587781632 bytes == 0x93182000 @  0x7fe0e3b01b6b 0x7fe0e3b21379 0x7fe098a7f04e 0x7fe098a80f4a 0x7fe0d196e0c4 0x7fe0dfbbb5d9 0x551b15 0x5aa6ec 0x50abb3 0x50c5b9 0x508245 0x509642 0x5a55a1 0x5a58f8 0x4e07ee 0x50abe1 0x50c5b9 0x508245 0x58958c 0x5a067e 0x50d966 0x508245 0x50a080 0x50aa7d 0x50d390 0x508245 0x50a080 0x50aa7d 0x50c5b9 0x508245 0x50a080
| Type the input sentence and press return:
ion channels are integral membrane proteins involved in specialized physiological functions demanding a precise control of the membrane permeability as regards the exchange of water molecules, ions and even small solutes (metabolites and antibiotics) [1] [2] [3] . the modulation of channel current occurs in response to a diversity of cellular signals including changes in voltage across the cell membrane (voltage-gated ion channels), chemical stimulus (ligand-gated ion channels, phosphorylation), changes in temperature, mechanical deformation and interaction with other molecules in the cell. the physiological significance of some of these mechanisms reported in vitro has been questioned because they require extreme conditions hard to meet in vivo (unrealistic high voltages, non-physiological concentrations, etc.) [4] . accordingly, many studies have focused on the role of solution acidity [5, 6] , an elementary factor that crucially regulates ion channel activity, extensively studied both in vivo and in vitro [5] [6] [7] [8] . relevant examples of pore function modulation by ph include potassium and sodium channels, chloride channels, the mitochondrial voltage-dependent anion channel (vdac) or bacterial porins of the outer membrane of gram-negative bacteria (ompf, ompc, phoe of escherichia coli ) [7, 8] , among others.\nnarrow channels have pore dimensions comparable to the size of the permeating ions. this means that protons could block these channels current by steric reasons just occluding the channel eyelet [4] . in contrast, wide pores allowing the simultaneous passage of waa e-mail: alcaraza@uji.es ter molecules and hydrated ions require more sophisticated mechanisms: protons regulate the channel conductance in a gradual way via complex networks of titratable residues involving inter-and intramolecular interactions [5] . recent studies show also that either narrow or wide channels may use hydrophobic gating to regulate ion transport across them [9] . 
efforts to understand those molecular interactions in ion channels are driven by the fact that proteins are highly cooperative structures [10, 11] . cooperative interactions are important factors for certain protein functions and imply some sort of communication among the system\'s components that allows either for a decisive response over a limited range of concentrations (positive cooperativity) or for a response that is less decisive but also less restricted with regard to concentration of the ligand (negative cooperativity) [12] .\nwe focus here on the changes in the ionic selectivity of membrane channels with ph, an issue still unaddressed by available all-atom md simulations and only partially explained by lower resolution mean field approaches [13] [14] [15] [16] . taking advantage of the fact that selectivity vs. ph curves display characteristic "sigmoidal dose response" shape [1, 17] we apply the hill formalism [18] , which is commonly used in biochemistry and pharmacology to analyze binding or kinetic data [19] . one could argue that proteins having a large number of ionizable residues (usually more than 100) would routinely present apparent cooperativity, reflecting the superposition of independent residue titrations rather than genuine cooperative mechanisms [20, 21] . we examine data from previous articles and from original experiments to show that this is not the case. . experimental data are taken from ref. [24] . the solid lines correspond to the fitting to eq. (1). and the sars-cov e) that exhibit contrasting cooperative features. later we discuss experiments where is n < 1, indicating negative cooperativity. in this case we aim to discriminate between actual physical interactions (as it is always the case for positive cooperativity) and apparent cooperativity (the so called spurious cooperativity).\nwild-type ompf, kindly provided by dr. s. bezrukov (nih, bethesda, usa), was isolated and purified from an e. coli culture. 
mutants d113c and d113r [22] were a generous gift from dr. h. miedema (wetsus, the netherlands). planar membranes were formed by the apposition of monolayers across orifices with diameters of 70-100 μm on a 15 μm thick teflon partition using diphytanoyl phosphatidylcholine. the orifices were pre-treated with a 1% solution of hexadecane in pentane. an electric potential was applied using ag/agcl electrodes in 2 m kcl, 1.5% agarose bridges assembled within standard 250 ml pipette tips. the potential was defined as positive when it was higher on the side of the protein addition (the cis side of the membrane chamber), whereas the trans side was set to ground. an axopatch 200b amplifier (molecular devices, sunnyvale, ca) in the voltage-clamp mode was used to measure the current and applied potentials. the chamber and the head stage were isolated from external noise sources with a double metal screen (amuneal manufacturing corp., philadelphia, pa). the ph was adjusted by adding hcl or koh and controlled during the experiments with a glp22 ph meter (crison instruments, barcelona). measurements were obtained at t = (23.0 ± 1.5) • c. the reversal potential measurements were corrected with the liquid junction potential calculated from henderson\'s equation, as described in detail elsewhere [23] .\nwhen a concentration gradient is set between both sides of the membrane, a net flux of ions through membrane pores (and hence an electric current) appears. the sign and magnitude of the applied voltage that is needed to make zero the electric current (the so-called reversal potential, v rev ) reveals the preferential passage of either positive or negative ions. in most ion channels the reversal potential changes substantially with the solution ph [24] [25] [26] [27] , as shown in fig. 1 with two different systems, namely the sars-cov e protein channel [28] ( fig. 1(a) ) and the pora protein (n. meningitidis) [24] ( fig. 1(b) ). 
in both cases, the channel discrimination for ions turns from weak cationic selectivity at neutral ph into anionic selectivity in acidic solutions. this can be explained considering that when the ph decreases, more and more acidic groups become protonated and the effective charge of the channel changes from negative to positive [24, 29] . we use the hill formalism to obtain information of how solution acidity regulates v rev . the theoretical curves fitted to the reversal potential data use the form [6, 13] \nin the two panels of fig. 1 we find a common pattern, the hill coefficient is slightly higher than 1 (positive cooperativity). this suggests that these proteins have developed high sensitivity mechanisms aiming to detect minimal changes in their environment [13] . furthermore, the effective pk of both curves (the ph that provokes a response halfway between the baseline (bottom) and maximum (top)) lies between 4 and 4.5, which is comparable to the typical pka of acidic residues (pka ∼ 4.4 and 4.0 for glutamic and aspartic acids, respectively) [14, 17, 24] . the similarities between the two panels are thought-provoking because the sars-co v e and the pora most probably have very different pore arrangement. the sars-cov e protein forms proteolipidic channels [29] . lipid molecules assemble with e proteins to form a combined tight arrangement in which the actual number of e monomers is unknown. experiments with different membrane compositions indicate that the protonation of residues in the transmembrane protein domain of e protein is not affected by the charge of the lipid polar heads [28] . therefore, positive cooperativity in this case fits with its canonical meaning in well-known oligomeric structures like hemoglobin [18, 30] : it most likely arises from the interaction between protein monomers. in contrast, the pora forms monomeric proteinaceous channels located in the outer membrane of neisseria meningitidis. 
in other monomeric proteins positive cooperativity has been linked either to interactions between distinct binding domains behaving as functional subunits (recoverin) or to concerted conformational changes (vdac) [13] . it is tempting to speculate that the interactions between matching clusters of charges acting as selectivity filter of the channel [24] may have cooperative nature, although the question remains open since no crystallographic structure of any complete pora protein has been resolved up to date.\nthe considerations made in the previous section emphasize the usefulness of the hill formalism as diagnostic tool to detect subtle inter-subunit or inter-domain communication in membrane proteins displaying positive cooperativity. however, in other protein channels showing negative cooperativity the analysis could be much more demanding. in this sense, the experiments performed in the bacterial porin ompf from e. coli, shown in fig. 2 (a) can be considered a case study. all measured curves show negative cooperativity (n < 1) but with the particularities that both the hill coefficient and the effective pk of the curves decrease significantly as salt concentration is increased. remarkably, diluted solutions show almost no cooperativity, as shown in the inset of fig. 2(a) . the question that we aim to investigate here is whether this negative cooperativity is genuine or it is a meaningless mathematical artifact that appears because of the superposition of independent titrations. this effect is illustrated in fig. 2(b) for the superposition of four independent and non-cooperative (n = 1) titration curves (lines) with pk from 3.5 to 5.0. the resulting superposed curve (circles) does present negative cooperativity (n = 0.75) with an averaged pk = 4.25. of note, the superposition of independent titrations can only produce apparent negative cooperativity and cannot yield curves with a hill coefficient n > 1, like those shown in fig. 1 and elsewhere [13] . 
although the superposition effect ( fig. 2(b) ) could give reason for the shape of the curves reported in fig. 2(a) , it cannot be invoked to explain two features of the negative cooperativity found in ompf. first, the origin of the low values attained by the effective pka in fig. 2 (a) at high concentrations, which differ from typical pka of acidic residues (somewhere between 4 and 5); and, second, why the effect of salt is the opposite of the well-known screening [31] : both the pka and the hill coefficient decrease with increasing salt concentration. we reported similar observations about the hill coefficient and pka in experiments involving ompf conductance and current noise [6] . there, we ascribed these effects to the competitive binding of salt cations and protons occurring in the channel narrow constriction [6] , formed by two acidic residues (d113 and e117) lined in front of a cluster of arginines, as shown in fig. 3 . interestingly, such competitive binding would also explain the findings reported here. the presence of cations around certain acidic residues increases the amount of protons needed to titrate the site, thus lowering the effective pk and changing the shape of the overall titration curve. clearly, such effects are more important the higher the concentration of salt.\ncomplementary insights can be obtained from an energetic analysis, having in mind that cooperativity could be interpreted as a competition between enthalpic and entropic effects [32] [33] [34] . a positive cooperative response requires a coupling of various stabilizing interactions that tighten the structure yielding an enthalpic benefit and an entropic cost. in contrast, negative cooperativity boosts the conformational freedom of the system, what occurs with a cost in enthalpy and a benefit in entropy [32] [33] [34] . in the case of a genuine negative cooperativity, the mechanism might be expected to be largely entropic in origin. 
recently, we have shown that this is the case [16] . the interaction of several receptors (binding sites) with different kinds of ligands (protons and cations) involves a multiplicity of arrangements in the channel that generates a significant contribution from the configurational entropy [16] . this entropic factor reinforces the existence of a genuine negative cooperativity in the ompf channel.\non the basis of the reasoning in which the ph titration shown in fig. 2(a) involves the interaction of different types of ligands and binding sites [6] , we could expect noticeable changes in the hill analysis of v rev if any of the critical residues allegedly involved are mutated. a number of previous studies suggest that the acidic residues d113 and e117 are key to control the channel sensitivity to ph [6, 16, 17] . in fact, the replacement of these two acidic residues with neutral cysteines (cc-mutant) eliminated the large conductance decrease found for wt ompf in low ph solutions [6] . for the sake of simplicity, we focus here only in the residue d113 studying two single-site mutants, the d113c (the aspartic acid is replaced with a neutral cysteine) and d113r (the aspartic acid is replaced with a positive arginine). figure 4 (a) shows the comparison between reversal potential experiments in wt ompf, d113c and d113r mutants in kcl 1.0/0.1 m. the importance of d113 in the mechanism of ph sensitivity is evident. just by changing the state of charge of this residue out of the 102 ionizable residues per ompf monomer, the effective pk increases from 2.4 to 3.3 (d113c) or to 3.8 (d113r).\nalso, the hill coefficient increases significantly from 0.43 (wt) to 0.79 (d113c) or to 0.86 (d113r). the substitution of one acidic residue with either neutral or positive residues almost eliminates the observed pk shift and negative cooperativity. 
one could argue that even in the most favorable case (d113r) the non-cooperative state is not regained, so that other residues (most probably e117 and others) may also participate in the process of competitive binding mentioned above. an alternative explanation could lie on the fact that the whole ompf trimer has 306 ionizable residues, so that we cannot completely rule out that the hill analysis contains a partial contribution of non-genuine apparent cooperativity similar to the situation depicted in fig. 2(b) . in fact, the existence of spurious cooperativity occurring along with genuine cooperativity is not an unexpected result, on the contrary, it is a landmark phenomenon when studying the regulation of biochemical processes in multiple-site systems [21] .\nbesides the mutation of critical channel residues, the competitive binding occurring in the central constriction of the channel can be probed with the addition of an extra ligand that alters the binding equilibrium and thus the cooperativity observed. taking advantage of the knowledge of an x-ray ompf structure showing a binding site for mg 2+ cations located between residues d113 and e117 [35] , we performed reversal potential experiments in wt ompf upon addition of millimolar concentrations of mgcl 2 . figure 4(b) shows the results obtained (green squares) compared to the measurements performed in the absence of mgcl 2 (blue circles). interestingly, the presence of mg 2+ reduces the measured reversal potential at neutral ph, showing a similar effect to that of the d113r mutant in fig. 4(a) . also, both the hill coefficient and effective pk increase compared to the control experiment. in contrast to mutated proteins, protons are able to titrate the site regardless the presence of mg 2+ ions and thus the reversal potential at low ph matches that of the control experiments (without mgcl 2 ). 
to complement this study, we replaced traces of mgcl 2 with lacl 3 , having in mind that la 3+ ions are well-known ion channel modulators showing stronger effects than mg 2+ [36] . in the case of lacl 3 no structure is available, but functional studies demonstrated that la 3+ ions interact with the residues located in the central constriction, being d113 and e117 the most plausible candidates [36] . as expected, lower concentrations of lacl 3 have similar effects to mgcl 2 in the ph titration of the reversal potential in ompf, as shown in fig. 4 (b) (red triangles). therefore, the presence of an extra ligand, mg 2+ or la 3+ ions, reduces the negative cooperativity observed, thus supporting the statement that the competitive binding between cations and protons has a central role in the observed negative cooperativity.\nby combining ph-dependent selectivity experiments performed in bacterial porins and viroporins we have shown that the hill formalism can be useful to analyze the cooperative behavior of these proteins. we show that in addition to the most commonly accepted notion of cooperativity (interaction between different subunits in oligomeric protein channels) alternative phenomena linked to either positive or negative cooperativity can appear in monomeric channels. we pay special attention to the bacterial porin ompf to demonstrate that one cannot rely on the hill coefficient of a single curve as the definite tool to assess genuine negative cooperative in multi-site systems like ion channels. a combination of different experiments, even involving site-directed mutagenesis, is mandatory to elucidate the origin of the underlying physical interaction. we present solid evidences that the observed negative cooperativity in ompf arises from genuine sources, namely a competitive binding between protons and cations. this mechanism could be linked to the ability of the protein to modulate ionic transport over a very wide range of ph values.
/pytorch/aten/src/ATen/native/BinaryOps.cpp:66: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
S-0 ion channels are integral membrane proteins involved in specialized physiological functions demanding a precise control of the membrane [UNK] as regards the exchange of water [UNK] ions and even small [UNK] [UNK] and [UNK] [UNK] [UNK] [UNK] . the modulation of channel current occurs in response to a diversity of cellular signals including changes in voltage across the cell membrane [UNK] ion [UNK] chemical stimulus [UNK] ion [UNK] [UNK] changes in [UNK] mechanical deformation and interaction with other molecules in the [UNK] the physiological significance of some of these mechanisms reported in vitro has been questioned because they require extreme conditions hard to meet in vivo [UNK] high [UNK] [UNK] [UNK] [UNK] [UNK] . [UNK] many studies have focused on the role of solution [UNK] [UNK] [UNK] , an elementary factor that [UNK] regulates ion channel [UNK] extensively studied both in vivo and in vitro [UNK] [UNK] [UNK] [UNK] . relevant examples of [UNK] function modulation by ph include potassium and sodium [UNK] chloride [UNK] the mitochondrial [UNK] [UNK] channel [UNK] or bacterial [UNK] of the outer membrane of [UNK] bacteria [UNK] [UNK] [UNK] of [UNK] coli ) [UNK] [UNK] , among [UNK] channels have [UNK] dimensions comparable to the size of the [UNK] [UNK] this means that [UNK] could block these channels current by [UNK] reasons just [UNK] the channel [UNK] [UNK] . in [UNK] wide [UNK] allowing the simultaneous passage of [UNK] [UNK] [UNK] ter molecules and [UNK] ions require more sophisticated [UNK] [UNK] regulate the channel [UNK] in a gradual way via complex networks of [UNK] residues involving [UNK] [UNK] interactions [UNK] . recent studies show also that either narrow or wide channels may use [UNK] [UNK] to regulate ion transport across them [UNK] . efforts to understand those molecular interactions in ion channels are driven by the fact that proteins are highly cooperative structures [UNK] [UNK] . 
cooperative interactions are important factors for certain protein functions and imply some sort of communication among the [UNK] components that allows either for a decisive response over a limited range of concentrations [UNK] [UNK] or for a response that is less decisive but also less restricted with regard to concentration of the ligand [UNK] [UNK] [UNK] [UNK] focus here on the changes in the ionic [UNK] of membrane channels with [UNK] an issue still [UNK] by available [UNK] md simulations and only partially explained by lower resolution mean field approaches [UNK] [UNK] [UNK] [UNK] . taking advantage of the fact that [UNK] [UNK] ph curves display characteristic [UNK] dose [UNK] shape [UNK] [UNK] we apply the hill [UNK] [UNK] , which is commonly used in biochemistry and [UNK] to analyze binding or kinetic data [UNK] . one could argue that proteins having a large number of [UNK] residues [UNK] more than [UNK] would routinely present apparent [UNK] reflecting the [UNK] of independent residue [UNK] rather than genuine cooperative mechanisms [UNK] [UNK] . we examine data from previous articles and from original experiments to show that this is not the [UNK] . experimental data are taken from [UNK] [UNK] . the solid lines correspond to the fitting to [UNK] [UNK] and the [UNK] [UNK] that exhibit contrasting cooperative [UNK] later we discuss experiments where is n < [UNK] indicating negative [UNK] in this case we aim to [UNK] between actual physical interactions [UNK] it is always the case for positive [UNK] and apparent [UNK] [UNK] so called [UNK] [UNK] [UNK] kindly provided by [UNK] [UNK] [UNK] [UNK] [UNK] [UNK] was isolated and [UNK] from an [UNK] coli [UNK] mutants [UNK] and [UNK] [UNK] were a generous gift from [UNK] [UNK] [UNK] [UNK] the [UNK] [UNK] membranes were formed by the
H-0 -0.2399412989616394 as regards the exchange of water and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and even small [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions and [UNK] ions
P-0 -2.7302 -0.0446 -0.2181 -0.3097 -0.0106 -0.1876 -1.1719 -1.2706 -0.3719 -0.7857 -0.2937 -0.0816 -0.0913 -0.2056 -0.5902 -0.0909 -0.0400 -0.1110 -0.1614 -0.5916 -0.0750 -0.0325 -0.0797 -0.1369 -0.4495 -0.0491 -0.0299 -0.0561 -0.1128 -0.3546 -0.0463 -0.0303 -0.0545 -0.1077 -0.3349 -0.0436 -0.0341 -0.0660 -0.1046 -0.3250 -0.0411 -0.0336 -0.0808 -0.1042 -0.3279 -0.0428 -0.0372 -0.0932 -0.0984 -0.3113 -0.0400 -0.0388 -0.1061 -0.1002 -0.2925 -0.0411 -0.0401 -0.1281 -0.1054 -0.3048 -0.0390 -0.0385 -0.1393 -0.1092 -0.1049 -0.1784 -0.0846 -0.1399 -0.1188 -0.1976 -0.0568 -0.0543 -0.4345 -0.2147 -0.0860 -0.1199 -0.1128 -0.8991 -0.1517 -0.1772 -0.0925 -0.0963 -0.5904 -0.1255 -0.1643 -0.0649 -0.0832 -0.5793 -0.1275 -0.1028 -0.1191 -0.1094 -0.7388 -0.1258 -0.1548 -0.0931 -0.0954 -0.7337 -0.1254 -0.1445 -0.0834 -0.0993 -0.7416 -0.1353 -0.1513 -0.0820 -0.0982 -0.8389 -0.1400 -0.1481 -0.0744 -0.0987 -0.9415 -0.1444 -0.1458 -0.0750 -0.0948 -0.9963 -0.1626 -0.1267 -0.8353 -0.1377 -0.1581 -0.6441 -0.1505 -0.1643 -0.3590 -0.1340 -0.2158 -0.6541 -0.2159 -0.3484 -0.8315 -0.1638 -0.2550 -0.3416 -0.1492 -0.1483 -0.1478 -0.1887 -0.1234 -0.1476 -0.1580 -0.1924 -0.1688 -0.1243 -0.1764 -0.1423 -0.1215 -0.1487 -0.1192 -0.1183 -0.1394 -0.1117 -0.1162 -0.1392 -0.1076 -0.1127 -0.1365 -0.1039 -0.1134 -0.1267 -0.0956 -0.1106 -0.1229 -0.0919 -0.1085 -0.1212 -0.0881 -0.1077 -0.1203 -0.0827 -0.1073 -0.1195 -0.0809 -0.1039 -0.1130 -0.0753 -0.1044 -0.1101 -0.0727 -0.1009 -0.1104 -0.0719 -0.1005 -0.1044 -0.0692 -0.0914 -0.1000 -0.0626 -0.0877 -0.0726 -0.1075 -0.2392 -0.2199 -0.2573 -0.1619 -0.2580 -0.1570 -0.0950 -7.0815

Source text length before passing it as input = 2735; after = 593. Every source text longer than approximately 600 tokens gets truncated, even though I set the maximum source and target positions explicitly (6000 and 512, respectively).
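For anyone who wants to reproduce the before/after comparison, here is a minimal sketch. It assumes lengths are counted as whitespace-separated tokens and that the echoed source appears on a log line tagged `S-<id>`, as in the output above. The helper names (`split_log_line`, `count_tokens`, `truncation_report`) are hypothetical and not part of fairseq or ProphetNet.

```python
def split_log_line(log_line):
    """Split a log line such as 'S-0 <text>' into (tag, text).

    The tag and the text may be separated by a tab or by spaces.
    """
    parts = log_line.strip().split(None, 1)
    tag = parts[0] if parts else ""
    text = parts[1] if len(parts) > 1 else ""
    return tag, text


def count_tokens(text):
    """Count whitespace-separated tokens (an assumption about how length is measured)."""
    return len(text.split())


def truncation_report(original, log_line):
    """Compare the original source with the source echoed on an 'S-<id>' log line."""
    tag, echoed = split_log_line(log_line)
    if not tag.startswith("S-"):
        raise ValueError("expected a source line tagged 'S-<id>', got %r" % tag)
    before = count_tokens(original)
    after = count_tokens(echoed)
    return {"before": before, "after": after, "truncated": after < before}


if __name__ == "__main__":
    # Toy data only; replace with the real input and the real 'S-0' log line.
    print(truncation_report("a b c d e", "S-0\ta b c"))
```

Note that a whitespace count only matches the model's view of the input if the text has already been split into the model's subword vocabulary.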

Would appreciate help on this! Thank you.

gouldju1 commented 4 years ago

I don't even get truncation. Instead, I get an error:

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-interactive", line 11, in <module>
    load_entry_point('fairseq==0.9.0', 'console_scripts', 'fairseq-interactive')()
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 190, in cli_main
    main(args)
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 136, in main
    for batch in make_batches(inputs, args, task, max_positions, encode_fn):
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 48, in make_batches
    max_positions=max_positions,
  File "/usr/local/lib/python3.6/dist-packages/fairseq/tasks/fairseq_task.py", line 150, in get_batch_iterator
    indices, dataset, max_positions, raise_exception=(not ignore_invalid_inputs),
  File "/usr/local/lib/python3.6/dist-packages/fairseq/data/data_utils.py", line 188, in filter_by_size
    ).format(ignored[0], dataset.size(ignored[0]), max_positions))
Exception: Size of sample #0 is invalid (=(611, 0)) since max_positions=(512, 512), skip this example with --skip-invalid-size-inputs-valid-test

Input:

fairseq-interactive ../ProphetNet_resources/cnndm/processed \
--path ../ProphetNet_resources/prophetnet_large_160G_cnndm_model.pt \
--user-dir ./src/prophetnet \
--max-source-positions 6000 --max-target-positions 512 \
--task translation_prophetnet
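The exception above reports `max_positions=(512, 512)` despite `--max-source-positions 6000`, so one possible workaround (other than the `--skip-invalid-size-inputs-valid-test` flag the message suggests, which skips the example) is to shorten each input line before it reaches `fairseq-interactive`. This is only a sketch, not a fix for the underlying limit. It assumes the input is already tokenized so that whitespace-separated tokens correspond to model tokens, and that one position should be left free for an end-of-sentence symbol. `truncate_tokens` and its parameters are hypothetical names.

```python
def truncate_tokens(line, max_positions=512, reserved=1):
    """Keep at most (max_positions - reserved) whitespace-separated tokens.

    max_positions: limit taken from the error message above (512).
    reserved: positions left free, e.g. for an end-of-sentence symbol (assumption).
    """
    budget = max_positions - reserved
    if budget < 0:
        raise ValueError("reserved must not exceed max_positions")
    tokens = line.split()
    return " ".join(tokens[:budget])


def truncate_lines(lines, max_positions=512, reserved=1):
    """Apply truncate_tokens to every line, preserving line order."""
    return [truncate_tokens(line, max_positions, reserved) for line in lines]


if __name__ == "__main__":
    long_line = " ".join("tok%d" % i for i in range(611))
    short = truncate_tokens(long_line)
    print(len(short.split()))  # number of tokens kept
```

The truncated lines could then be written to a file and passed via the `input` option shown in the `Namespace` dump, instead of being typed interactively. Anything beyond the budget is dropped, so the tail of a long document never reaches the model.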